Back


Newsgroup sci.stat.consult 21961

Directory

Subject: Re: kappa -- From: Chauncey Parker <"chaunce\"removethis\""@u.washington.edu>
Subject: Re: LOWESS regression -- From: Ronan Conroy
Subject: Re: LOWESS regression -- From: ghaverla@freenet.edmonton.ab.ca ()
Subject: Contract Work Wanted: MFC, OWL, C++, ActiveX, Visual Basic 3,4,5, MS Access, Client Server, HTML -- From: dmarkham@roanoke.infi.net (Daniel B. Markham)
Subject: 1997 New Researchers' Conference -- From: cocteau@research.att.com (Mark Hansen)
Subject: Re: delete one jackknife for multiple regression, matrix inversion in QuickBasic 4.5 -- From: Gary McClelland
Subject: Re: Signs of eigenvectors -- From: Kyosti Huhtala
Subject: Help required on Bartlett Estimation and confidence intervals. -- From: kleong@tartarus.uwa.edu.au (Weng Chong Leong)
Subject: Re: Multi-Dimensional Scaling -- From: hamer@rci.rutgers.edu (Robert Hamer)
Subject: Delivery Error -- From: postmaster@MAIL.CATO.COM
Subject: Modern Regression and Classification - Hawaii -- From: Trevor Hastie

Articles

Subject: Re: kappa
From: Chauncey Parker <"chaunce\"removethis\""@u.washington.edu>
Date: Tue, 14 Jan 1997 00:16:48 -0800
as you have clearly pointed out, I am obviously a dunce about this;
and thus:
I still wonder about the case where two raters may have the same mean yet a poor
ICC coefficient, and I wonder about the case where
raters 1, 2, and 3 could have means in that order, that is,
mean 2 could be closer to mean 1 than mean 3, but
means 1 and 3 could have better reliability (by ICC) than means 1 and 2.
I am still not sure why you would rather do a t-test and a Pearson correlation,
since in calculating an ICC you would have the results of an ANOVA to test
"if one rater or rating is systematically higher than
another," and you would have a correlation coefficient that, as I understand
the literature, contains more information than Pearson's, since the
ICC incorporates association and agreement rather than only association.
Chauncey wrote:
> : snip . . .
> 
> : If your measure is interval-like, the ICC is the interrater reliability
> : stat to use, I would think.  But alas, I'm still quite a naive student.
Richard F Ulrich wrote:
> 
> It seems to me that there must be some blindness in the way that
> "reliability"  is being taught, because the point that I was making is
> a simple one...  yet, this is not the first time that it has been
> missed.
> 
> The intra-class correlation (ICC)  is a fine measurement for
> publishing what you have achieved in "reliability".  Unfortunately,
> it does nothing to illustrate or test or separate out the
>  *systematic differences*  that may occur between raters - they
> just serve to lower the correlation slightly, since the ICC makes
> the assumption that the raters have equal means.
> 
> In almost any kind of work that I can think of, it *ought*  to be a
> concern if one rater or rating is systematically higher than
> another.  The powerful way to test this is with the paired t-test;
> the concomitant statistic to the paired t-test is the Pearson
> correlation  -  together they give both aspects of comparing the
> ratings, SIMILARITY and DIFFERENCE.
> 
> So, the ICC may be what editors want to see, and it is okay as
> a one-number summary, but anyone examining their own reliability
> data has little excuse (IMHO) not to look at tests of difference,
> where they are appropriate.
> 
> Rich Ulrich, biostatistician                wpilib+@pitt.edu
> http://www.pitt.edu/~wpilib/index.html   Univ. of Pittsburgh
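Rich's distinction between SIMILARITY and DIFFERENCE is easy to see numerically. Below is a small NumPy sketch with invented ratings where rater 2 runs systematically 5 points high: the Pearson r stays near 1, the paired t flags the shift, and a one-way random-effects ICC(1) (one of several ICC variants, used here just for illustration) is dragged down by the shift.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = rng.normal(50, 10, size=30)
rater1 = truth + rng.normal(0, 1, size=30)
rater2 = rater1 + 5 + rng.normal(0, 1, size=30)   # systematically 5 points higher

# Pearson correlation: the SIMILARITY side
r = np.corrcoef(rater1, rater2)[0, 1]

# paired t statistic: the DIFFERENCE side
d = rater2 - rater1
t = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# one-way random-effects ICC(1): the systematic shift drags it down
Y = np.column_stack([rater1, rater2])             # n subjects x 2 raters
n, k = Y.shape
row_means = Y.mean(axis=1)
msb = k * row_means.var(ddof=1)                                # between subjects
msw = ((Y - row_means[:, None]) ** 2).sum() / (n * (k - 1))    # within subjects
icc1 = (msb - msw) / (msb + (k - 1) * msw)
```

Here r comes out near 1 while icc1 is noticeably lower, which is exactly the "agreement as well as association" point.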
Return to Top
Subject: Re: LOWESS regression
From: Ronan Conroy
Date: Tue, 14 Jan 1997 09:38:32 +0000
ANNE KNOX wrote:
> Sorry to bombard this newsgroup with regression questions!
>
> I'm looking for statistical programs that can perform locally weighted
> (LOWESS) regressions.  Any suggestions?  Also, are there any references
> that discuss statistical testing of LOWESS regressions?
>
Lowess smoothing is available in Data Desk and Stata (under the ksm
command).
Smoothing is a way of filtering your data to look for signal within the
noise. It starts from the opposite position to classical regression in
which the form of the function is specified in advance and the parameters
are estimated from the data. For that reason it does not test a specified
model and therefore isn't a hypothesis test.
A smoother is very important, however, when you are fitting a model. It
allows you to see if, for example, the variables really seem to take on a
linear relationship throughout their range. It is useful for detecting
phenomena which might otherwise go unnoticed, such as a threshold effect
where a relationship abruptly changes direction or magnitude. Using a
smoother is a useful check that your model isn't a serious
misrepresentation of your data.
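For anyone without Data Desk or Stata to hand, the core idea is small enough to sketch. The following is a minimal local linear smoother with tricube weights, assuming only NumPy; it omits Cleveland's robustness iterations, so it is the idea of lowess rather than the full published algorithm.

```python
import numpy as np

def lowess(x, y, frac=0.5):
    """Tricube-weighted local linear fit at each x[i] -- a minimal sketch of
    the idea, without Cleveland's robustness iterations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    k = max(3, int(np.ceil(frac * n)))       # neighbourhood size
    yhat = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]              # k nearest neighbours of x[i]
        h = d[idx].max()
        h = h if h > 0 else 1.0
        w = (1 - (d[idx] / h) ** 3) ** 3     # tricube weights
        sw = np.sqrt(w)                      # weighted least squares via sqrt(w)
        A = np.column_stack([np.ones(k), x[idx]])
        beta, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y[idx], rcond=None)
        yhat[i] = beta[0] + beta[1] * x[i]
    return yhat
```

On exactly linear data this reproduces the line; on noisy data it traces the local trend, which is what makes it useful as a check on a parametric model.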
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
    _/_/_/      _/_/     _/_/_/     _/     Ronan M Conroy
   _/    _/   _/   _/  _/          _/      Lecturer in Biostatistics
  _/_/_/    _/          _/_/_/    _/       Royal College of Surgeons
 _/   _/     _/              _/  _/        Dublin 2, Ireland
_/     _/     _/_/     _/_/_/   _/         voice +353 1 402 2431
                                           fax   +353 1 402 2329
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Get a life, they said, but I couldn't find the ftp site...
Return to Top
Subject: Re: LOWESS regression
From: ghaverla@freenet.edmonton.ab.ca ()
Date: 14 Jan 1997 13:22:42 GMT
David Rothman (nyrtd@ny.ubs.com) wrote:
: ANNE KNOX wrote:
: > 
: > I'm looking for statistical programs that can perform locally weighted
: > (LOWESS) regressions.  Any suggestions?  Also, are there any references
: > that discuss statistical testing of LOWESS regressions?
: look at the bell labs site.  there are papers and code by cleveland.
Program in question is LOESS (REAL*8 version is DLOESS).
It is in the netlib repository.  The PostScript manual
goes through several examples of statistical analysis,
and has a bibliography.  I've presently misplaced my copy,
so that is all I can give you for now.
--
Gordon Haverland, B.Sc. M.Eng.       Area: Materials Science and Engineering 
101  9504 182 St. NW                 Professional Status: Available
Edmonton, AB, CA  T5T 3A7
403/481-8019                         Email: ghaverla@freenet.edmonton.ab.ca
Return to Top
Subject: Contract Work Wanted: MFC, OWL, C++, ActiveX, Visual Basic 3,4,5, MS Access, Client Server, HTML
From: dmarkham@roanoke.infi.net (Daniel B. Markham)
Date: Tue, 14 Jan 1997 14:42:42 GMT
Bedford Technology Group is a customer-oriented corporation focusing
on software development for medium to large businesses. We do full
life-cycle consulting, managing small (5 - 10 contractor) programming
projects, and working as part of a larger team.
We specialize in Object-Oriented development, including data modeling,
OOA, and OOD, and provide complete solutions typically including
relational databases, client-server HTML configurations, and VB /
ActiveX front-ends.
We're a small company looking to help you out in a big way. For
further information, give us a call at 540-297-9187, drop by our web
page at http://www.infi.net/~dmarkham/ or respond via E-Mail.
Thank you for your time and consideration.
Dan Markham
President,
Bedford Technology Group
Return to Top
Subject: 1997 New Researchers' Conference
From: cocteau@research.att.com (Mark Hansen)
Date: Tue, 14 Jan 1997 13:30:58 GMT
                   CONFERENCE ANNOUNCEMENT
   The Third North American Conference of New Researchers
                       July 23-26, 1997
                       Laramie, Wyoming. 
The purpose of this meeting is to provide a venue for recent
Ph.D. recipients in Statistics and Probability to meet and share their
research ideas. All participants will give a short expository talk or
poster on their research work.  In addition, three senior speakers
will present overview talks.  Anyone who has received a Ph.D. after
1992 or expects to receive one by 1998 is eligible.  The meeting is to
be held immediately prior to the IMS Annual Meeting in Park City, Utah
(July 28--31, 1997), and participants are encouraged to attend both
meetings.  Abstracts for papers and posters presented in Laramie will
appear in the IMS Bulletin.
The New Researchers' Meeting will be held on the campus of the
University of Wyoming in Laramie, and housing will be provided in the
dormitories.  Transportation to Park City will be available via a
charter bus.  Partial support to defray travel and housing costs is
available for IMS members who will also be attending the Park City
meetings, and for members of sponsoring sections of the ASA.
Additional information on the conference and registration is available
at the website: http://www.math.unm.edu/NR97.html.  Or contact
Prof. Snehalata Huzurbazar, Department of Statistics, University of
Wyoming, Laramie, WY 82071-3332, USA; email: lata@uwyo.edu; fax:
307-766-3927.
This meeting is sponsored in part by the Institute of Mathematical
Statistics; the National Science Foundation, Statistics and
Probability Program; the ASA Section on Bayesian Statistical Sciences;
the ASA Section on Statistical Computing; and the ASA Section on
Quality and Productivity.
-----------------------------------------------------------------------------
 Room 2C-260, Bell Laboratories
 Innovations for Lucent Technologies    Phone: (908) 582-3868
 700 Mountain Avenue                    Fax:   (908) 582-3340 
 Murray Hill, NJ 07974                  Email: cocteau@research.bell-labs.com
 URL: http://cm.bell-labs.com/who/cocteau/index.html
Return to Top
Subject: Re: delete one jackknife for multiple regression, matrix inversion in QuickBasic 4.5
From: Gary McClelland
Date: Mon, 13 Jan 1997 11:47:50 -0700
Jack Hayes wrote:
> 
> I'm hoping to find a delete one jackknife program for multiple
> regression.  If anyone out there knows where I can get a shareware
> program to do this, please let me know.
> 
It is relatively easy to make one's own using formulas for
residual analysis.  The studentized deleted residual is the value
of t for adding a dummy variable to predict the deleted observation.
The formulas for the studentized deleted residual make it clear
that if you know the hat matrix, then a new residual sum of
squares with any observation deleted is easy to compute.
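Gary's point can be made concrete. With hat matrix H and ordinary residuals e, the residual sum of squares with observation i deleted is SSE - e_i^2/(1 - h_ii), with no refitting. A short NumPy sketch on invented data (the brute-force refit agrees with the formula):

```python
import numpy as np

# toy regression data
rng = np.random.default_rng(0)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
e = y - H @ y                           # ordinary residuals
h = np.diag(H)
sse = e @ e

# residual sum of squares with observation i deleted -- no refitting needed
sse_del = sse - e**2 / (1 - h)

# studentized deleted residuals (the t for a dummy variable flagging obs i)
t_del = e / np.sqrt((sse_del / (n - p - 1)) * (1 - h))
```

Looping this over i is the whole delete-one jackknife for the residual sums of squares.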
gary
-------------------------------------------------------------------
Gary.McClelland@Colorado.edu		Dept of Psychology, CB345
http://psych.colorado.edu/~mcclella/    Univ of Colorado
voice: 303-492-8617			Boulder, CO 80309-0345
fax:   303-492-5580			USA
------------------------------------------------------------------
Return to Top
Subject: Re: Signs of eigenvectors
From: Kyosti Huhtala
Date: Tue, 14 Jan 1997 07:41:49 +0200
Central Inst for the Deaf wrote:
> 
> Hello,
> 
> I computed the eigenvalues and eigenvectors of the following
> covariance matrix using Matlab and code from numerical recipes in 'c'.
> 
> They both return the same eigenvalues, but the signs of the eigenvectors
> of the 2 smallest eigenvalues (0.0238, 0.0782) are reversed. Can
> someone shed some light on this for me?
> Thanks
> Don
If v is an eigenvector of matrix A with the corresponding eigenvalue e,
then Av = ev, but also A(-v) = e(-v). Thus, also -v is an eigenvector
with the same eigenvalue e.
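The same fact in a few lines of NumPy, on a small symmetric matrix chosen just for illustration:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
evals, evecs = np.linalg.eigh(A)
v, e = evecs[:, 0], evals[0]
flipped = -v   # an equally valid eigenvector for the same eigenvalue
```

Which of the two signs a routine returns is an arbitrary normalization choice, so Matlab and Numerical Recipes are both "right".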
-- 
Kyösti
huhtala@jyu.fi, http://www.stat.jyu.fi/~huhtala
Return to Top
Subject: Help required on Bartlett Estimation and confidence intervals.
From: kleong@tartarus.uwa.edu.au (Weng Chong Leong)
Date: 14 Jan 1997 15:35:42 GMT
Hi,
I’m testing for the long-run neutrality of money using the following model
(Fisher and Seater (1993)):
[y(t) - y(t-k-1)] = a (k) + b(k)[m(t)-m(t-k-1)] + u(kt)  
where:
y = log of GNP
m = log of money supply
k = lag length.
If the estimated b(k) = 0, it can be said that money is neutral.
If the estimated b(k) not equal to 0, then money is not neutral to GNP.
My question is, what is a Bartlett estimator? 
b(k) can be estimated using OLS, why then do people use the Bartlett
Estimator?
‘The estimates of b(k) were obtained for k = 1 to 30, and 95-percent
confidence intervals corrected by the Newey-West technique were
constructed from a t-distribution using n/k degrees of freedom’.
Why are the degrees of freedom equal to n/k, where n = sample size?
Has it got anything to do with the Bartlett estimator?
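For what it may be worth, the "Bartlett estimator" in the Newey-West context is presumably the Bartlett-kernel estimate of the long-run variance: gamma(0) + 2 * sum over j = 1..L of (1 - j/(L+1)) * gamma(j), where gamma(j) is the lag-j autocovariance and L is the lag truncation. A sketch, assuming NumPy (this illustrates the kernel only, not the full Fisher-Seater procedure):

```python
import numpy as np

def newey_west_lrv(u, L):
    """Bartlett-kernel (Newey-West) long-run variance estimate of a series u,
    with lag truncation L: gamma(0) + 2 * sum_{j=1..L} (1 - j/(L+1)) * gamma(j)."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    n = len(u)
    gamma = lambda j: (u[j:] * u[:n - j]).sum() / n   # lag-j autocovariance
    v = gamma(0)
    for j in range(1, L + 1):
        v += 2.0 * (1.0 - j / (L + 1)) * gamma(j)
    return v
```

With L = 0 it reduces to the ordinary sample variance; the declining Bartlett weights keep the estimate positive while accounting for serial correlation in u(kt).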
Many thanks.
Replies via email will be much appreciated.
Kenneth.
Return to Top
Subject: Re: Multi-Dimensional Scaling
From: hamer@rci.rutgers.edu (Robert Hamer)
Date: 14 Jan 1997 12:02:55 -0500
Clay Helberg  writes:
>BTW, a good reference for MDS is Young & Hamer (1987), _Multidimensional
>Scaling: History, theory, and applications_. (Lawrence Erlbaum)
Thank you.
-- 
--(Signature)      Robert M. Hamer hamer@rci.rutgers.edu 908 235 4218
  Do not send me unsolicited email advertisements.  I have never and
  will never buy.  I will complain to your postmaster.
  "Mit der Dummheit kaempfen Goetter selbst vergebens" -- Schiller
Return to Top
Subject: Delivery Error
From: postmaster@MAIL.CATO.COM
Date: Tue, 14 Jan 1997 12:38:20 -0500
A message To: rjmcnaill@mail.cato.com
        From: -maiser-@crl (mailserver at crl)
     Subject: Message not delivered
Produced the following MHS/SMF delivery notification:
   101: Unknown user at destination host
---------------------[ returned message ]--------------------
Error #101 has occurred attempting to deliver a message to
RJMCNAILL@CRL
You have sent an electronic mail message to RJMCNAILL@CRL.
This user is not known to mail server CRL.
Original Message Follows:
Date:     Tue, 14 Jan 1997 12:27:00 -0500
Sender:   "Statistics and statistical discussion list: STAT-L"
Reply-To: "Statistics and statistical discussion list: STAT-L"
From:     Automatic digest processor 
Subject:  STAT-L Digest - 13 Jan 1997 to 14 Jan 1997 - Special issue
To:       Recipients of STAT-L digests 
There are 12 messages totalling 2020 lines in this issue.
Topics in this special issue:
  1. Probability and Wheels: Connections and Closing the Gap
  2. Delivery Error
  3. The New Palgrave's Time Series and Statistics Book FS
  4. LOWESS regression (2)
  5. Multi-Dimensional Scaling
  6. population census/estimates
  7. Comparing R^2 for different DVs?
  8. Combining Neural/Fuzzy Models with Statistical Models
  9. Power for repeated measures designs
 10. Help--Application of a characteristic function
 11. 1997 New Researchers' Conference
----------------------------------------------------------------------
Date:    Mon, 13 Jan 1997 22:16:31 GMT
From:    Uenal Mutlu 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
>It would be useful if we had a simulation software which for example
LOTSIM - Simulation-Program for all pick-X type Lottery Games
Ok, a very first version of the mentioned LOTSIM Simulation Program for
Lotto is ready. Email me if anyone wants to try it out (warning: IMHO
useful only for mathematically interested people; runs as a commandline
program in a DOS-Box under Win95 or WinNT. ZIPed approx. 60 KB, EXE only).
(BTW, Karl Schultz posted a similar C-program some days ago in r.g.l.)
Below are 2 runs of simulations. The difference is: in the first run
fixed ticket numbers were used during all draws in each of the trials,
whereas in the second simulation, in each draw the ticket was randomly
chosen.
The values of the first are slightly better, but should this difference be
really significant? IMHO yes, but I'm not yet sure if the differences are
also statistically significant.
It seems that fixed tickets give a higher degree of success than random
tickets. Can this be true, or is the difference not really significant?
(Any comments from statisticians? Which significance testing method would
be appropriate for this?)
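One quick check on a single cell of the two tables below (the "0 wins" counts, 9003 vs 9015 out of 25000): a two-proportion z-test. This is only a sketch of one reasonable approach; comparing the full win-count distributions would call for something like a chi-square test instead.

```python
from math import sqrt

# counts of trials with 0 wins in each 25000-trial run (from the tables below)
x1, x2, n = 9003, 9015, 25000
p1, p2 = x1 / n, x2 / n
pooled = (x1 + x2) / (2 * n)
se = sqrt(pooled * (1 - pooled) * (2 / n))
z = (p1 - p2) / se   # well inside +/-1.96: no evidence of a real difference
```

The z statistic is tiny, so on this cell at least the fixed-vs-random difference looks like ordinary sampling noise.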
*******************************************************************
LOTSIM v1.00beta - Math Simulations for Lotto 6/49, 6/54, 5/32 etc.
Author/(c): U.Mutlu (bm373592@muenchen.org)
Limits    : vMax=54 kMax=7
Tested    : with 6/49 under Win95
Usage     : LOTSIM cTrials cDraws cWin v k fTicketsFixed ...
Cmdline   : LOTSIM 25000 54 3 49 6 1 ...
Simulation settings:
 v=49 k=6 cDraws=54 cTicketsPlayedPerDraw=1 fTicketsFixed=1
 cTrials=25000
 A ticket is defined as having 6 different nbrs 1..49
 Draw numbers are generated by the standard RNG, ie. the rand()
 function. Seed (srand(time)) is done once at pgmstart.
 Bonus Number not drawn and not calculated
 A 'win' is defined as having >= 3 matching on a ticket
Simulation results:
  wins occurance      %       cumul%
  ---- --------- --------- ---------
 >= 10         0   0.00000   0.00000
     9         0   0.00000   0.00000
     8         0   0.00000   0.00000
     7         1   0.00400   0.00400
     6        10   0.04000   0.04400
     5        68   0.27200   0.31600
     4       356   1.42400   1.74000
     3      1556   6.22400   7.96400
     2      4745  18.98000  26.94400
     1      9261  37.04400  63.98800
     0      9003  36.01200 100.00000
       ---------
           25000
Interpretation:
 25000 trials were made, each consisted of 54 drawings and in
 each drawing 1 fixed ticket was played. Simulations were done
 for Lotto 6/49. A win is defined as having >= 3 matching numbers.
 The table says for example "after 25000 trials of 54 drawings
 each, there were 9261 cases where only 1 draw in each series
 (ie. of ea. 54) had a win". And, we're dealing with "at least" wins.
*******************************************************************
LOTSIM v1.00beta - Math Simulations for Lotto 6/49, 6/54, 5/32 etc.
Author/(c): U.Mutlu (bm373592@muenchen.org)
Limits    : vMax=54 kMax=7
Tested    : with 6/49 under Win95
Usage     : LOTSIM cTrials cDraws cWin v k fTicketsFixed ...
Cmdline   : LOTSIM 25000 54 3 49 6 0 ...
Simulation settings:
 v=49 k=6 cDraws=54 cTicketsPlayedPerDraw=1 fTicketsFixed=0
 cTrials=25000
 A ticket is defined as having 6 different nbrs 1..49
 Draw numbers are generated by the standard RNG, ie. the rand()
 function. Seed (srand(time)) is done once at pgmstart.
 Bonus Number not drawn and not calculated
 A 'win' is defined as having >= 3 matching on a ticket
Simulation results:
  wins occurance      %       cumul%
  ---- --------- --------- ---------
 >= 10         0   0.00000   0.00000
     9         0   0.00000   0.00000
     8         0   0.00000   0.00000
     7         0   0.00000   0.00000
     6        12   0.04800   0.04800
     5        67   0.26800   0.31600
     4       366   1.46400   1.78000
     3      1522   6.08800   7.86800
     2      4729  18.91600  26.78400
     1      9289  37.15600  63.94000
     0      9015  36.06000 100.00000
       ---------
           25000
Interpretation:
 25000 trials were made, each consisted of 54 drawings and in
 each drawing 1 random ticket was played. Simulations were done
 for Lotto 6/49. A win is defined as having >= 3 matching numbers.
 The table says for example "after 25000 trials of 54 drawings
 each, there were 9289 cases where only 1 draw in each series
 (ie. of ea. 54) had a win". And, we're dealing with "at least" wins.
-- Uenal Mutlu (bm373592@muenchen.org)   
   * Math Research * Designs/Codes * SW-Development C/C++ * Consulting *
   Loc: Istanbul/Turkey + Munich/Germany
------------------------------
Date:    Tue, 14 Jan 1997 00:06:38 -0500
From:    postmaster@MAIL.CATO.COM
Subject: Delivery Error
A message To: rjmcnaill@mail.cato.com
        From: -maiser-@crl (mailserver at crl)
     Subject: Message not delivered
Produced the following MHS/SMF delivery notification:
   101: Unknown user at destination host
---------------------[ returned message ]--------------------
Error #101 has occurred attempting to deliver a message to
RJMCNAILL@CRL
You have sent an electronic mail message to RJMCNAILL@CRL.
This user is not known to mail server CRL.
Original Message Follows:
Date:     Tue, 14 Jan 1997 00:00:12 -0500
Sender:   "Statistics and statistical discussion list: STAT-L"
Reply-To: "Statistics and statistical discussion list: STAT-L"
From:     Automatic digest processor 
Subject:  STAT-L Digest - 12 Jan 1997 to 13 Jan 1997
To:       Recipients of STAT-L digests 
There are 33 messages totalling 1511 lines in this issue.
Topics of the day:
  1. How does intercept affect SST and R-square in ANOVA models?
  2. Simpson's paradox, anyone?
  3. 4253H filter
  4. Probability and Wheels: Connections and Closing the Gap (6)
  5. /method subcommand in spss manova
  6. SAS SAS SAS longterm contracts in Phoenix, AZ.!!!!!!
  7. LOWESS regression
  8. Best Design of Experiments software?? (2)
  9. kappa (2)
 10. Regression for Error in Y and X
 11. Comments needed on using databases for data entry
 12. hypergeometric distributions: hyp. tests and inference?
 13. subscribe
 14. An internet conference on Quality (2)
 15. TRatios in PROC ARIMA
 16. testing diff between partial reg coeffs in different samples
 17. Joint confidence interval-degrees of freedom
 18. AOV table in QPRO
 19. Multi-Dimensional Scaling
 20. Parameter Estimation by Sequential Testing
  21. Regression question: why R^2 = r^2?
 22. Power Analysis
 23. Response Surface Methods Conference! June 19 - 21, 1997
 24. gen'zd lm multiple comparisons
 25. Power for repeated measures designs
----------------------------------------------------------------------
Date:    Fri, 10 Jan 1997 15:53:52 GMT
From:    Richard F Ulrich 
Subject: Re: How does intercept affect SST and R-square in ANOVA models?
mdg1@Lehigh.EDU wrote:
: Hi,
: I have an ANOVA model and I notice that there's a huge difference in the
: SST (sum of squares for total) and R-square values before and after specifying
: the intercept term in my model. Why?
The SST is ordinarily the sum of squared deviations around the MEAN, and
that is what you are seeing when you include the intercept term; which
is the ordinary model that everyone expects.  You state "ANOVA model"
but I assume that you are talking about ANOVA-by-regression methods;
it seems to me that the "intercept" of an ANOVA would depend on how
you dummy-code the groups, and the only prospect that seems to me
to make sense is to account for the intercept.  -  The basic ANOVA
model is to compare the variation around the overall mean, i.e.,
intercept, to the sum of variations around separate means.
For regression, in general, prediction of  Y from one or more X:
If you do not have an intercept term, then you might compute SSTo  as
the sum of squared deviations around ZERO  -  clearly a much larger
number when the mean of Y is not near zero.
Now, your SSEffect  is computed as  (SST-SSResidual),
and R-squared is computed as  (SSEffect/SST), so
 a) it makes a big difference which version of SST you are using,
and b) any computer package should write out a warning when it is
using SSTo .
If SST is the larger number (SSTo), you have a larger SSEffect, and
a larger R-squared.  If SST is based on the mean, it is possible to
compute SSEffect as negative, and see a 'negative R-squared',
whenever the explanatory variable does worse than the mean would  -
This seems to me to be the preferable way to compute and present
results, unless you are in some (rare) circumstance where there are
definite reasons to always omit the mean from the model.
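The two SST conventions are easy to compare side by side. A NumPy sketch on invented data with a mean far from zero:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 10.0 + 2.0 * x + rng.normal(size=50)   # mean of y is far from zero

X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = ((y - X @ b) ** 2).sum()

sst_mean = ((y - y.mean()) ** 2).sum()   # SST about the mean (intercept model)
sst_zero = (y ** 2).sum()                # SSTo about zero (no-intercept convention)

r2_mean = 1 - sse / sst_mean
r2_zero = 1 - sse / sst_zero
```

With the same SSE, the zero-based SSTo is much larger, so the R-squared it produces is inflated exactly as described above.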
Rich Ulrich, biostatistician                wpilib+@pitt.edu
http://www.pitt.edu/~wpilib/index.html   Univ. of Pittsburgh
 -----------------------------
Date:    Fri, 10 Jan 1997 17:11:52 GMT
From:    Dean Nelson 
Subject: Re: Simpson's paradox, anyone?
In article <32d55077.194086637@news.zippo.com>, j_weedon@escape.com says...
>
>Someone's just referred me to Simpson's Paradox. I can't find any
>reference to it in my texts - can anyone explain it for me?
>
>TIA,
>Jay Weedon.
There is a good example in Bishop, Fienberg, and Holland, "Discrete
Multivariate Analysis".  The most famous example had to do with the
admission rate of females to UC Berkeley (I think).  Although,
college-wide, a greater proportion of males were admitted, when broken
down to individual schools, females were consistently admitted at a higher
rate (i.e. a greater proportion of applicants were admitted).  This 'paradox'
is attributed to the fact that the schools had vastly different applicant
pools, the harder sciences having many more male applicants.
Another example is the electoral college.  It is possible to lose the
election with a popular majority.
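The admissions version of the paradox can be reproduced with invented numbers (these are not the real data, just counts shaped to produce the reversal):

```python
# invented admission counts (admitted, applicants) -- hypothetical numbers,
# chosen only to show the arithmetic shape of Simpson's paradox
school_a = {"men": (480, 600), "women": (90, 100)}    # women: 90% vs men: 80%
school_b = {"men": (10, 100), "women": (120, 600)}    # women: 20% vs men: 10%

rate = lambda t: t[0] / t[1]
men_overall = (480 + 10) / (600 + 100)      # 0.70
women_overall = (90 + 120) / (100 + 600)    # 0.30
```

Women win in each school, men win overall: most women applied to the low-admission school, which is the "different applicant pools" point above.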
Dean
 -----------------------------
Date:    Fri, 10 Jan 1997 10:27:53 -0500
From:    Paul Velleman 
Subject: Re: 4253H filter
In article <5b3v2m$1c7@bermejo.cibnor.mx>, delmonte@cortes.cibnor.mx (Pablo
Del Monte [BM]) wrote:
> Can anybody explain to me how the 4253H filter works? I'm looking
> beyond the definition in the user's manual of STATISTICA.
> I will appreciate very much any kind of help.
Yes.  I originally proposed this filter and did the basic work on it in my
dissertation.  The best published reference is
"Definition and Comparison of Robust Nonlinear Data Smoothing Algorithms",
Journal of the American Statistical Association, 75, September 1980,
609-615.
This gives both the definition and the rationale for the smoother.
4253H is a nonlinear low-pass filter with a sharp drop in its transfer
function between passed and stopped frequencies, little Gibbs rebound, and
excellent resistance to isolated spikes in the data (see the paper).
Briefly, 4253H passes a series of nonlinear and linear filters over the
data:
4 takes a running median of width 4. This is equivalent to taking a 25%
trimmed mean because the high and low values of each quadruple are dropped
and the middle two averaged.
2 averages pairs of values, and realigns the data at its original "time"
points.
5 takes a running median of 5
3 takes a running median of 3
H is a "Hanning" step, a running average weighted .25, .5, .25
These are each done in turn on the result of the previous pass, each pass
resmoothing the result of the previous pass. The smoother should then be
"reroughed" by applying it to the residuals and adding the result back to
the original smooth sequence.
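The steps above can be sketched in NumPy. This is only an illustration of the pass structure: it ignores the end-point rules (endpoints are copied, and the even-width median shortens the series by a few points) and omits the reroughing step.

```python
import numpy as np

def run_median(x, w):
    # even width: each median sits between original time points,
    # so the output is len(x) - w + 1 long
    if w % 2 == 0:
        return np.array([np.median(x[i:i + w]) for i in range(len(x) - w + 1)])
    # odd width: centred window; endpoints copied unchanged (a simplification)
    half = w // 2
    y = x.copy()
    for i in range(half, len(x) - half):
        y[i] = np.median(x[i - half:i + half + 1])
    return y

def smooth_4253h(x):
    x = np.asarray(x, dtype=float)
    m4 = run_median(x, 4)              # "4": running medians of 4
    m2 = (m4[:-1] + m4[1:]) / 2        # "2": recentre on the original time points
    m5 = run_median(m2, 5)             # "5"
    m3 = run_median(m5, 3)             # "3"
    h = m3.copy()                      # "H": Hanning weights .25, .5, .25
    h[1:-1] = 0.25 * m3[:-2] + 0.5 * m3[1:-1] + 0.25 * m3[2:]
    return h
```

Even this stripped-down version shows the spike resistance: an isolated spike is wiped out by the first median pass, while a constant series passes through unchanged.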
Caveats:
I have not seen Statistica's implementation. If their documentation doesn't
cite my JASA paper, then I don't know where they learned the algorithm, so
I cannot guarantee that they do it right. You can compare their results to
the results generated by Data Desk.
There is a special "end point rule" to improve behavior near the edges of
the sequence. I don't know what Statistica has implemented there either.
There are better resistant nonlinear smoothers avaiable now. 4353H is
almost 20 years old. Look at the Trewess smoother in Data Desk for one
example. An improved version of trewess is scheduled to appear in a new
release of Data Desk soon. There is also work being done at what used to be
called Bell Labs (who knows their name this week?) (sort of like referring
to Prince...) on resistant adaptive smoothers that looks promising, but I
don't know if it is avaialable yet.
-- Paul Velleman
 -----------------------------
Date:    Mon, 13 Jan 1997 02:25:48 GMT
From:    Craig Franck 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
bm373592@muenchen.org (Uenal Mutlu) wrote:
>I think this posting contains useful hints for a successful play
>strategy! Read on!
>
>On Thu, 9 Jan 1997 14:38:13 GMT, nveilleu@NRCan.gc.ca (Normand Veilleux)
>wrote:
>
>>consecutive drawings if you buy 1 ticket per draw.  And even after 168
>>draws, there still is 0.042398 probability of having lost all draws.
>
>That's saying 95.76 % chance of winning (>= 3) if playing the same 1 ticket
>in 168 consecutive draws. IMHO an important conclusion from this would be:
> Playing the same 1 ticket in x consecutive draws is better than playing
> x different tickets (or a wheel) in 1 draw.
>Isn't it?
No. If the odds of winning a game are 1 in 54 million, then if you
buy 54 tickets, your odds of winning are 1 in 1 million. If you buy
1 ticket for 54 drawings, your odds of winning never get above 1 in
54 million. The only way to increase your odds of winning is to buy
more tickets. The only thing that is important about them is that
none of them have all of the same numbers. (Although, if the pot were
split, you would get 2 shares; not really worth considering as a
strategy though.)
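Normand's figure quoted above checks out directly. The per-draw chance of at least 3 matches on one 6/49 ticket is a hypergeometric sum, and raising its complement to the 168th power recovers his 0.042398:

```python
from math import comb

# P(at least 3 matches) for one 6/49 ticket in a single draw
p = sum(comb(6, k) * comb(43, 6 - k) for k in range(3, 7)) / comb(49, 6)

# chance of losing every one of 168 independent draws with one ticket each
p_lose_all = (1 - p) ** 168
```

Note the calculation is identical whether the ticket is the same in every draw or freshly picked each time, which is the heart of Craig's point: each draw is independent, so "sticking with your numbers" changes nothing.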
>If yes, then the further practical generalization of this statement
>would be:
> Don't change your numbers; ie. play always the same numbers (tickets or
> wheel) until you have a win.
It doesn't matter. However, people who play the same numbers every
week do tend to be more loyal players; they would probably want
to kill themselves if their numbers came in, and they didn't play
them that particular week!
> --> So one should also very well think of analysing the past draws for
>     choosing the 'right' expected numbers (it's normally a one-time task)
>
>I think, that's it! Ie. IMO this is a very important key fact for a
>successful play strategy! Isn't it?
There are no expected numbers, unless you mean the ones you play
every week and are hoping they will be drawn.
>Sure! Because, we no longer start again from the beginning at each draw.
>Instead we keep it constant since probability says "using the same
>numbers a win should occur in the next x draws..." But, if we change the
>numbers each time then everything starts again from the beginning, so this
>should be strictly avoided!...
>
>What do others think on this strategy?
If you want to increase the odds of winning, buy more tickets.
The game is constructed so that those who play more are more
likely to win. That is the incentive to play. And if you don't
play, the odds of winning are 0. If you bought all 54 million
combinations the odds of winning would be 1 in 1. (If you share
the pot you will win less than the total jackpot, so there is
no guarantee that you will get your 54 million back.) :-)
>>If you do come up with the same number, then it implies that wheeling
>>does not change, in any way, the average number of winning tickets.
>
>But then also the opposite is true: wheeling is at least equal to using
>the same number of any different randomly chosen single tickets. True?
As long as they are for the same drawing. Otherwise, wheeling
works better. But at the same time, saying "give me 168 easy
picks" is just as good. (You may not get a free ticket, but
if that is the only return on 168 bucks, it is not much of a
consolation prize. That, and there is the fact that in 168 easy
picks, two may be the same.)
>Are there any situations where wheeling behaves worse than using
>randomly or otherwise chosen different tickets of the
>same size?
If you think you have some psychic abilities, letting numbers
pop into your head may work better than these "covering
combinations" schemes. What is so sad is that even if you use
one of these methods of picking numbers and win the jackpot,
all it proves is that you "got lucky".
--
Craig
clfranck@worldnet.att.net
Manchester, NH
The less a man makes declarative statements, the
less he's apt to look foolish in retrospect.
  -- Quentin Tarantino in "Four Rooms"
 -----------------------------
Date:    Fri, 10 Jan 1997 09:22:30 -0700
From:    Karl Schultz 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
C. K. Lester wrote:
>
> In article <32D53060.B3D@fc.hp.com>, Karl Schultz  wrote:
> >C. K. Lester wrote:
> >>
> >> In response to Karl Schultz's prior post,
> >>
> >> >There are no subsets.  The 168-ticket wheel will guarantee a 3-match
> >> >in a 6/49 lotto.
> >> >
> >> >No, the first statement is correct.
> >> >The "wheeled group" is the entire set of 49 numbers.
> >>
> >> So, with a 6/49, buying 168 tickets using the 168-ticket wheel will
> >> guarantee a 3-match...
> >>
> >> So what?
> >
> >Because you were asking about this!!!!!
>
> NO NO NO... sheesh almighty. I was referring to the "perceived value" of
> such a scheme... as in, "what value is buying 168 tickets for a guaranteed
> three-match?" Maybe I should have said, "Big deal."
The above representation of your question is much better than the
vague "So what?".
The perceived value, IMHO, is as follows.  People like to win.
If they can be sure to walk away with something, then they
might take steps to do that.  The only way to increase your
chances of winning is to play more numbers.  If you are
in the habit of playing 100+ numbers at a time and have
had a long losing streak, you might be inclined to play
the 168-ticket wheel, so that you are sure to have to make
that trip to the counter to claim a prize.  Actually, you
have a 60+% chance of getting 3 wins with 168 tickets,
but that is another story.  So, it is a psychological
thing - sure to get a win.
In the end, you are right.  Big Deal.
The wheel is just a structured way to buy more
tickets, which, in itself will increase chances.
Now, here is a real tough question for wheel experts.
If one plays 168 tickets using the wheel, they are sure
to match 3 at least once.  What does this wheel do to
one's chances to match more than 3???  There was once
a speculation that playing this wheel will reduce the
chances of matching more than 3 on one ticket.  Any
truth to this?
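For a single 6/49 ticket, the match-count distribution is hypergeometric, which is where the "1 in 54" for at least 3 matches comes from. A sketch (this says nothing yet about whether a 168-ticket wheel concentrates or spreads the >3 matches across its tickets, which is the open question above):

```python
from math import comb

def p_match(k, picks=6, pool=49):
    """Probability that one ticket matches exactly k of the 6 drawn numbers."""
    return comb(picks, k) * comb(pool - picks, picks - k) / comb(pool, picks)

total_draws = comb(49, 6)                             # 13,983,816 combinations
p_exactly_3 = p_match(3)                              # about 1 in 56.7
p_at_least_3 = sum(p_match(k) for k in range(3, 7))   # about 1 in 54
```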
 -----------------------------
Date:    Fri, 10 Jan 1997 17:17:01 GMT
From:    David Nichols 
Subject: Re: /method subcommand in spss manova
In article <32d297e5.15739961@news.zippo.com>,
Jay Weedon  wrote:
>On 7 Jan 1997 05:24:45 GMT, lthompso@s.psych.uiuc.edu (Laura Thompson)
>wrote:
>
>>
>>
>>What has replaced the /method subcommand in spss manova?  I would like to
>>get sequential SS, but cannot..the default is unique. I do not have access
>>to version 7.0 and the glm procedure, and must use spss for this .. what
>>can I do?
>
>I don't know whether I've misunderstood the query, but according to my
>copy of SPSS for Windows Release 6.0, you can still specify
>/method=unique or /method=sequential in the manova procedure. Unique
>is, as you say, the default.
>
>Jay Weedon.
>
Our machines were having problems so I couldn't post when this was
originally posted. I sent an email to the original poster on this.
Jay's answer is correct. There was a change (a simplification of the
METHOD subcommand syntax) for release 5.0. In 4.x and earlier
releases where UNIQUE was the default (it was changed from SEQUENTIAL
to UNIQUE long ago), you would specify:
 /METHOD=SSTYPE(SEQUENTIAL)
or some version with three letters or more per word. In releases
beginning with 5.0, it's simply:
 /METHOD=SEQUENTIAL
--
-----------------------------------------------------------------------------
David Nichols             Senior Support Statistician              SPSS, Inc.
Phone: (312) 329-3684     Internet:  nichols@spss.com     Fax: (312) 329-3668
-----------------------------------------------------------------------------
 -----------------------------
Date:    Fri, 10 Jan 1997 11:28:29 +0100
From:    "Staff Connection, Inc." 
Subject: SAS SAS SAS longterm contracts in Phoenix, AZ.!!!!!!
SCI Southwest is currently seeking several experienced SAS programmers for
longterm contracts in Phoenix, AZ. Pluses: online
SAS,SCL,SAS/AF, Frame Entry
  Contact us:
Please fax resume ASAP to SCI: Fax: 612-545-3699; Email: sci@mm.com. ;
Voice: 612-545-2228
  Our market: Phoenix,  Las Vegas,  Minneapolis
  Matching Service -(Resume/Jobs)
We have a continuous flow of new contract and job opportunities daily.
Please fax us your resume. You can expect our utmost discretion on your
behalf.  We will inform you of appropriate matches and obtain your
permission prior to representing you to any companies.
We appreciate the opportunity to serve you.
  More detail about SCI (Staff Connection, Inc.) below, or
URL: http://www.mm.com/sci/
SCI (Staff Connection, Inc.), a Computer Consulting/Contract Service and
Permanent Job Placement Search Firm, established in 1984.  We represent
positions in the Southwest and Midwest USA regions including:
Minneapolis, Minnesota (MN), Phoenix, Arizona (AZ), and Las Vegas, Nevada
(NV).
Opportunities include areas such as: PC/UNIX/Client Server, Smalltalk, C,
C++, MS Windows/NT/95, Oracle, Sybase,Informix, Peoplesoft, SAP,
Powerbuilder, Foxpro, Delphi, WWW, Internet, JAVA, SAS, Unisys, AS400,
Tandem and some mainframe areas.
Titles include: Systems Architects, Designers, Analysts, Team
Leaders/Project Leaders/Managers, Application Programmers, Developers,
Systems, DBAs, Communications/Networking and System Administrators.
   Midwest:
  Minnesota
SCI (Staff Connection, Inc.)-Mpls
Fax: (612) 545-3699, Email: Internet-sci@mm.com,  Voice: (612) 545-2228
   Southwest:
  Arizona and Nevada
SCI Southwest
Fax-(602) 493-5417; Voice: (602) 493-6688;  Email: sci-sw@amug.org
 -----------------------------
Date:    Thu, 9 Jan 1997 16:52:56 -0800
From:    ANNE KNOX 
Subject: LOWESS regression
Sorry to bombard this newsgroup with regression questions!
I'm looking for statistical programs that can perform locally weighted
(LOWESS) regressions.  Any suggestions?  Also, are there any references
that discuss statistical testing of LOWESS regressions?
Thanks,
- Anne
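Cleveland's original LOWESS implementation ships with S-Plus, and for inference Cleveland & Devlin (1988, JASA) discuss approximate distribution theory for locally weighted regression. As a sketch of the idea only (a single pass with a tricube kernel, no robustness iterations, and not any package's actual algorithm):

```python
def tricube(u):
    """Tricube kernel weight: (1 - |u|^3)^3 on |u| < 1, else 0."""
    a = abs(u)
    return (1 - a ** 3) ** 3 if a < 1 else 0.0

def lowess_point(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit evaluated at x0 (one LOWESS pass)."""
    n = len(xs)
    k = max(2, int(frac * n))
    # bandwidth = distance to the k-th nearest neighbour of x0
    d = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
    w = [tricube((x - x0) / d) for x in xs]
    # weighted least squares for y = a + b*x in the local window
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, xs))
    sy = sum(wi * yi for wi, yi in zip(w, ys))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, xs))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a = (sy - b * sx) / sw
    return a + b * x0
```

Evaluating `lowess_point` over a grid of x0 values traces out the smooth curve.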
 -----------------------------
Date:    Thu, 9 Jan 1997 18:25:01 +0000
From:    "A.N.Cutler" 
Subject: Re: Best Design of Experiments software??
I like Minitab (http://minitab.com I think) or try the Statlib server. It is
very Response Surface oriented.
Tony Cutler
9 Oak Lane, WILMSLOW, Cheshire, SK9 6AA, UK
cutler@dial.pipex.com
 -----------------------------
Date:    Tue, 7 Jan 1997 14:10:12 GMT
From:    Bill Rising 
Subject: Re: kappa
In message <2.2.32.19970106023351.00690340@206.64.128.3> - Sun, 5 Jan 1997
21:33:51 -0500William Delaney  writes:
>
>We have tried to use STATA to derive Kappa values. Intercooled STATA
>can't handle more than 250 entries of a 5 choice table to compare
>inter-rater agreement.
I generated a fake data set with 2000 observations from two observers in a 5
choice table, and even with little memory, there was no problem. Perhaps I
don't understand what you mean by 'entries'.
[snip...]
Bill Rising
 -----------------------------
Date:    Fri, 10 Jan 1997 14:27:23 -0500
From:    Herman Rubin 
Subject: Re: Regression for Error in Y and X
In article <5b4nll$ns1@wnnews1.netlink.net.nz>,
Vit Drga  wrote:
>Jordi Riu  wrote:
>>>I am looking for information about regression models and techniques for
>>>cases where there is error in the independent X variable as well as the Y.
>(deleted)
>>>Mark Bailey
>>For a review of calibration methods that take into account the errors in
>>both axes try at:
>>Journal of Chemometrics, 9 (1995) 343-362
>>For methods comparison studies taking into account the errors in both
methods:
>>Analytical Chemistry, 68 (1996) 1851-1857
>>Hope it works.
>> ------------------------------------------------------------------
>> Jordi Riu
>The following references address this problem for linear regression.
>Wald, Abraham  (1940) "The fitting of straight lines if both variables
>are subject to error", Annals of Mathematical Statistics, Vol 11,
>pgs 284-300.
>Madansky, Albert  (1959) "The fitting of straight lines when both
>variables are subject to error", Journal of the American
>Statistical Association, Vol 54, pgs 173-205.
>[Abstract only]  Scott, Elizabeth L. (1947) "Explicit solution of the
>problem of fitting a straight line when both variables are subject to
>error for the case of unequal weights", Annals of Mathematical
>Statistics, Vol 18, pg 456.
>Mandel, John  (1964) "The Statistical Analysis of Experimental Data",
>Interscience Publishers (Wiley & Sons), New York.
>Mandel (1964) pgs 288-292 has a small section on the problem including
>a worked example.
>Are there any other references on the topic??
There are many more.  Without making such assumptions as the ratio of
variances, the normal case is unidentified.  Reiersol has the first
general paper on when identification can be made; this appeared in
Econometrica, v. 16, 1950, pp. 375-389.  Reiersol has references to
even older papers.  Neyman and Scott published consistent estimates
under a fairly general model in Annals of Mathematical Statistics v.
22, pp. 352-361, with a correction in v. 23, p. 135.  There is a lot in
the unpublished 1952 paper of T. A. Jeeves; I believe that this is his
University of California dissertation entitled, "Identifiability and
almost sure estimability of linear structure in n dimensions."  I have
a paper in Annals of Mathematical Statistics v. 29 which constructs
consistent estimates under an unusual identification condition
suggested by Jeeves; I doubt that it will interest many of you.
--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
hrubin@stat.purdue.edu         Phone: (317)494-6054   FAX: (317)494-0558
 -----------------------------
Date:    Fri, 10 Jan 1997 21:24:56 GMT
From:    Uenal Mutlu 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
On Thu, 09 Jan 1997 17:25:10 -0700, Karl Schultz  wrote:
>2) Generates 54 drawings to represent the player playing 54 tickets.
I'm not sure about this point, because in my posting I had asked
Mr. Sharkey about the interchangeability of these two things, ie.
>>I would say that the probability for both the below events is equal:
>> - playing the same 1 ticket in 54 consecutive draws
>> - playing 54 different tickets in 1 drawing
>>
>>Isn't it?
>
>No, it's not.  Not quite.
He also gave detailed examples (thanks), but since I can't play cards :-)
I have a hard time picturing them, so such examples with
cards confuse me more than they help me (I know it is my problem, but
if possible, I would like to see another example without cards etc.;
pure math and/or a formula would be ok.)
Another confusing thing (for me) is also the difference (if any)
between 'probability' and 'chance', for example in the following
excerpt of a posting of again Mr. Sharkey:
>If odds of AT LEAST 3 is 1/54 then chance of getting a 3 match  in 54 is
>    1 - [ (1- 1/54)^54]  or about  0.6355
(here, it's meant in 54 consecutive drawings (not 54 tickets in 1 drawing),
if I got it right).
But, what does this exactly mean? Here, we are using 1 ticket. Which of
the below is true:
 - 63.55 % chance of winning "at least 3" in the next drawing
 - 63.55 % chance of winning "at least 3" after participating in 54
   consecutive drawings
- Does the above apply only if the same 1 ticket (ie. fixed nbrs) is used?
- Does it matter (for the above formula) if in every drawing different
  (ie. randomly chosen) numbers are played (on 1 ticket).
If one plays in 54 consecutive drawings using the same 1 ticket:
 - is then the probability for   "at least 3" constant or does it change?
 - is then the chance of winning "at least 3" constant or does it change?
(Sorry if I used any wrong terms (ie. probability vs. chance etc.), but
IMHO it should be clear what is meant.)
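The two readings can be separated with one line of arithmetic. Per draw, the probability is constant at 1/54 whether the ticket's numbers are fixed or freshly randomized each time (every 6-number combination is equally likely either way); the 0.6355 is the cumulative chance of at least one "at least 3" win somewhere in the 54 draws:

```python
p = 1 / 54    # P(at least 3 matches) for one ticket in one draw (from the post)
n = 54

# Chance of winning "at least 3" at least once over n independent draws:
p_cumulative = 1 - (1 - p) ** n     # about 0.6355, the quoted figure

# The per-draw figure never changes: draws are independent, so before
# every single draw the ticket's chance is still exactly p, win or lose so far.
```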
 -----------------------------
Date:    Mon, 13 Jan 1997 10:34:04 +0000
From:    Ronan Conroy 
Subject: Re: Comments needed on using databases for data entry
Ken Reed writes:
>It's more a question of efficiency than taste. Using the coding above
>increases:
>1) Data entry costs.
you pay for quality
>2) Disk space
costs nothing nowadays
>3) Error, eg typing unkown instead of unknown
not allowed by any half-way reasonable database. I use epi-info, for
example, to ensure that the names of the locations where we work in
Africa are spelled consistently. The operator keys in N and epi-info
enters Nyonyorrie (and not Inyonyorrie, Nyonyorie or several other
legitimate variants [the Maasai themselves argue about the correct
name!]).
I have a poor researcher at the moment phoning the US to try to find the
person who keyed in the data for a study and then left the country
without leaving us any comprehensive notes on the coding scheme he used.
I know this is bad practice - my point is that inventing numeric coding
encourages this sort of muddle.
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
    _/_/_/      _/_/     _/_/_/     _/     Ronan M Conroy
   _/    _/   _/   _/  _/          _/      Lecturer in Biostatistics
  _/_/_/    _/          _/_/_/    _/       Royal College of Surgeons
 _/   _/     _/              _/  _/        Dublin 2, Ireland
_/     _/     _/_/     _/_/_/   _/         voice +353 1 402 2431
                                           fax   +353 1 402 2329
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Get a life, they said, but I couldn't find the ftp site...
 -----------------------------
Date:    Fri, 10 Jan 1997 22:13:08 GMT
From:    Richard F Ulrich 
Subject: Re: hypergeometric distributions: hyp. tests and inference?
I do like the answer of Ellen Hertz, but I will add just a bit, at the
end..
Ellen Hertz (eshertz@access.digex.net) wrote:
: Steve Cumming wrote:
: >
: > From a large (n \approx 100) equal size plots, I have count data for
: > two tree species A and B. Other species are present, but in very low
: > abundances---I plan to ignore them. I want to test for association
: > between A and B: do they usually occur together, or apart?
: >
: > Scatter plots of the data show that points are restricted
: > (approximately) to the region x+y \le n_{max}.  This is reasonable, as
: > there is a maximum density of trees one would expect.  This makes me
: > think that an appropriate null model is the hypergeometric
: > distribution.
: >
: > None of my usual sources show me how to do hypothesis testing under
: > this distribution. Also, it seems to me that various alternate hyp's
: > could be formulated in terms of a dispersion parameter of some kind:
: > over-dispersion implying that the species are segregated, and
: > under-disperson implying correlation in abundance. In other words, a
: > more general distribution is needed.
: >
: > Can anyone say if I am on the right track with this, and point me at a
: > good reference or two? Or source code?
: >
: > --
: > Steve Cumming                               "I could save the world
: > Research Scientist (NCE/SFM)                 if I could only get the
parts."
: > Center for Applied Conservation Biology     Honni soit qui mal y pense.
: > stevec@geog.ubc.ca
: You might test the independence of "plot" and "type tree" where plot
: has 100 levels and type tree has 2. If there are at least 5 of
: each type in each plot, this could be done with a chi-square test;
: if not you could use Fisher's exact test.
: Ellen Hertz
It seemed like there ought to be something simpler, but after
puzzling it out on my own, I finally came up with exactly what Ellen
had already suggested.
 -- detail, the warning is to have an "Expectation" of 5 for each
cell of the table, to keep the chisquared statistic robust.  What
to do for smaller plots, if the Exp[f(x+y)] is not that much? -
probably, for this analysis, drop those plots from the analysis.
(And "Fisher's test" actually is the 2x2 version of the multinomial,
but I guess everyone got the picture, "exact test for fixed margins".)
 -- It seems to me that there might be a different 'thing'  happening
when x+y  is near the max, than when it is far less.  If they really
are co-dependent, the proportions might be even more regular than
one would expect by chance.  Or, they could be more regular in the
smaller plots  - if they are paired up in small plots, whereas one of
x  might serve to help several of y  when they exist at higher density.
So, you might want to notice what happens if the hundred plots are
analyzed as two sets of 50, looking at the larger sums  (x+y)
separately from the smaller sums.
Rich Ulrich, biostatistician                wpilib+@pitt.edu
http://www.pitt.edu/~wpilib/index.html   Univ. of Pittsburgh
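For the 2x2 case mentioned above, the "exact test for fixed margins" is just a hypergeometric tail sum; a minimal one-sided sketch (the 100-level plot-by-species table in the original question would need a generalized exact test, which this does not attempt):

```python
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    P(upper-left cell >= a) with all margins fixed (hypergeometric tail)."""
    r1, r2 = a + b, c + d          # row totals
    c1 = a + c                     # first column total
    n = r1 + r2
    hi = min(r1, c1)               # largest feasible upper-left count
    tail = sum(comb(r1, x) * comb(r2, c1 - x) for x in range(a, hi + 1))
    return tail / comb(n, c1)
```

For Fisher's classic tea-tasting table [[3, 1], [1, 3]] this returns 17/70, about 0.243.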
 -----------------------------
Date:    Mon, 13 Jan 1997 14:44:21 TUR
From:    Hulya Atil 
Subject: subscribe
I would like to subscribe to your list.
Thanks...
Dr. Hulya Atil
 -----------------------------
Date:    Sat, 11 Jan 1997 01:26:38 GMT
From:    Barry DeCicco 
Subject: Re: Best Design of Experiments software??
In article <32D537FD.2C55@dial.pipex.com>, "A.N.Cutler" 
writes:
|> I like Minitab (http://minitab.com I think) or try the Statlib server. It is
|> very Response Surface oriented.
|>
|> Tony Cutler
|> 9 Oak Lane, WILMLSOW, Cheshire, SK9 6AA, UK
|> cutler@dial.pipex.com
|>
|>
And the learning process is much nicer.  I have shown
Minitab to at least five people recently (two engineers,
and three high school students), and they all took to
it readily.
Barry
 -----------------------------
Date:    Fri, 10 Jan 1997 01:26:44 GMT
From:    Xie Min 
Subject: An internet conference on Quality
Please take a look at this and post your comments!
http://www.mcb.co.uk/services/conferen/jan97/tqm/tqasia/theme13.htm
--------- an introduction ---------
The Asia-Pacific has commanded the economic headlines for
the past decade. It is now well recognised that Asia-Pacific
productivity levels have risen to world-beating standards.
Asian prosperity has brought indigenous technological
advance and high skill work forces contributing to greater
confidence. To sustain this success, organisations are
shifting from improving the efficiency of investments and
reducing costs to focusing on adding value through creativity
and innovation. This conference will address the influence of
the management of quality to this undoubted success.
The importance of the role of Medium Size Enterprises to a
nation's prosperity cannot be overestimated. It is a common
thread throughout the themes appearing at this conference.
Nevertheless, Dr. Hesan Quazi from Nanyang Technological
University (Singapore) provides a platform to discuss SMEs
from an Asian perspective, whereas Lawrie Corbett from
Victoria University Wellington (New Zealand), takes a more
holistic view of SMEs quest for quality.
Asia-Pacific is not just renowned for its productivity in
manufacturing, but also in services. In this conference, the
focus is on continuous improvement in regulated and
hospitality industries. Professor Barrie Dale, United
Utilities Professor of Quality Management at the University
of Manchester Institute of Science and Technology, United
Kingdom (UK), and his colleague invite deliberations on
continuous improvement in the former; Professor Richard
Teare, Charles Forte Professor of Hotel Management from
the University of Surrey, (UK), and his colleague will
facilitate discussions in the latter.
The prosperity enjoyed by the nations in this region today
owes a large debt to its manufacturing base. Professor David
Hamblin , Professor of Manufacturing Management from the
University of Luton (UK) and his associate, will facilitate
discussions in this arena; covering mature industries like
clothing manufacture, to success stories in the automotive
and electronics to future sectors like aerospace. In addition,
members of the Manufacturing Systems Research Group of
the University of Cambridge (UK) will explore the
implications of transferring of manufacturing technologies
right first time.
Management of change has become an essential challenge of
the economic miracle which the countries in this region are
enjoying, and the byword is culture. Professor Ross
Chapman from the University of Western Sydney (Australia)
and his associates will explore the culture factor -
organisational and national - in this equation. Managing
change is through managing people. Dr. Simon Lam and his
colleague from the University of Hong Kong (Hong Kong)
explore the evolution of the role of human resources in total
quality.
"Role models" - quality awards, ISO 9000 series
certification and environmental considerations are increasing
in importance amongst organisations in the Asia Pacific
region with the globalisation of competition and affluence of
society. Professor Luis Calingo of California State
University, Fresno (USA) stimulates discussions on the
significance and influence of such awards on the
management of quality. Dr. Tony Moody of the University
of Portsmouth (UK) will facilitate discussions on the
ISO9000 series, and the consequences of this commitment.
Ruth Hillary of Imperial College (UK), invites discussions
on the latter, with particular focus on total quality in the
context of environmental issues - ISO14001, EMAS,
Eco-Audits - and its significance in organisations and
society.
Dr. Mark Goh of the National University of Singapore
(Singapore) navigates the future by exploring total quality in
innovation management as the next approach to higher
productivity; whilst his colleague, Dr. Min Xie explores the
practicalities of statistical methods in quality.
Finally, befitting perhaps of this type of conference, Dr
Richard Barson of the University of Nottingham (UK) will
moderate deliberations on the management of quality in the
development of information technology (which perhaps in
part enabled the possibilities of conducting such
conferences?).
Thank you, and I hope you will find this an added value
experience.
Christopher Seow
Convenor of Internet Conference
Total Quality in the Asia Pacific
 -----------------------------
Date:    Fri, 10 Jan 1997 22:17:24 GMT
From:    Steve Roberts 
Subject: TRatios in PROC ARIMA
How does one get probability values associated with the TRatios that
are output from the ESTIMATE statement of PROC ARIMA?
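One workaround, assuming the usual large-sample normal approximation to the t ratio is acceptable for your series length, is to convert the ratio by hand:

```python
from math import erf, sqrt

def two_sided_p(t_ratio):
    """Two-sided p-value for a t ratio, via the standard normal
    approximation (adequate when the residual df is large)."""
    phi = 0.5 * (1 + erf(abs(t_ratio) / sqrt(2)))   # standard normal CDF
    return 2 * (1 - phi)
```

For example, a t ratio of 1.96 gives a p-value of about 0.05; with few residual degrees of freedom you would want the exact t distribution instead.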
 -----------------------------
Date:    Fri, 10 Jan 1997 23:16:01 GMT
From:    Tim B Heyland 
Subject: Re: testing diff between partial reg coeffs in different samples
I don't see this as any different than a two sample test for the
difference between two means.  The slope parameters from your regression
model are means and since the two samples were collected separately (I'm
assuming), then there won't be any covariance b/n the two estimates.
This last point relates to the calculation of the variance (or SE) of the
difference b/n two means, which in general is
Var{b1-b2} = Var(b1) + Var(b2) - 2*Cov(b1,b2)
In this case, the covariance term = 0 so the formula that someone else
wrote in an earlier reply is the correct one.
Another point regarding the independence/interaction is worth mentioning.
It is impossible for there to be any interaction b/n the male and female
sample.  Interaction is something that happens b/n two
covariates or factors not samples or populations.  Samples however, can be
dependent or independent.
Tim Heyland
Toronto, Ont.
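The recipe above reduces to a z statistic built from the two fitted slopes and their standard errors, since Var(b1-b2) = Var(b1) + Var(b2) - 2*Cov(b1,b2) and the covariance is zero across independent samples. A sketch (the normal approximation is an assumption; with small samples you would refer the statistic to a t distribution instead):

```python
from math import erf, sqrt

def slope_difference_z(b1, se1, b2, se2):
    """z statistic and two-sided p-value for H0: beta1 = beta2,
    slopes estimated from two independent samples.
    Var(b1 - b2) = Var(b1) + Var(b2) because Cov(b1, b2) = 0."""
    z = (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return z, p
```

With b and SE(b) read off the two regression printouts, e.g. `slope_difference_z(1.0, 0.3, 0.2, 0.4)` gives z = 1.6.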
 -----------------------------
Date:    Mon, 13 Jan 1997 17:47:00 +0100
From:    Jordi Riu 
Subject: Joint confidence interval-degrees of freedom
Hello,
Let's suppose I have a multivariate model, y=b0+b1x1+b2x2. I can build the
joint confidence interval for the b0, b1 and b2 coefficients as the joint
confidence ellipsoid, which follows an F test with alpha level and 3 and n-3
degrees of freedom. Now I take the former model, and I combine b1 and b2 in
the following way: c=b1+b2 in order to rewrite the model: y=b0+cx'.  My
problem concerns the degrees of freedom of the joint confidence interval
of the new model. Are they 2 and n-2 or 3 and n-3 (or something else)?
Thanks in advance.
 ------------------------------------------------------------------
 Jordi Riu                            tel.:   34-(9)77-558187
 Departament de Quimica               fax.:   34-(9)77-559563
 Univ. Rovira i Virgili de Tarragona  e-mail: rusell@quimica.urv.es
 Pl. Imperial Tarraco, 1
 43005-Tarragona    Catalonia - Spain
 -----------------------------
Date:    Mon, 13 Jan 1997 09:01:18 -0800
From:    "SAS & OS/2: THE PERFECT TEAM." 
Subject: Re: AOV table in QPRO
Hi!
I *assume* that by QPRO you are referring to QuattroPro, a spreadsheet
package (please do supply fuller details, e.g. version etc, in the future).
If so, you may or may not know that it assumes that every design is
balanced and that it automatically inserts 0's for "missing data".
Consequently, the SS and DF are wrong.
This is one of the reasons some statisticians have argued, often on this
forum, against the use of spreadsheets for statistical analysis. It is
(almost) the equivalent of using a tablespoon to move dirt in your
backyard: it just isn't the right tool for the job.
james ssemakula
teamos2
========
Date:         Tue, 7 Jan 1997 17:15:05 -0800
From:         Dr Horst Kaiser 
Subject:      AOV table in QPRO
X-To:         STAT-L@VM1.MCGILL.CA
Dear group members:
Below is a sample data set, I subjected to Analysis of Variance.  Since
I was curious to see how the Windows Version of QPRO handles the task, I
ran the test as suggested in the instructions.  The relevant section of
the result indicated that the AOV table is wrong.  The df-values are
incorrect and so is the F-value.  I could have come to the innocent
conclusion that the groups do not differ from each other, or that I should
accept the null hypothesis.   The second AOV table shows the correct
result (as taken from Zar 1981).
It seems quite remarkable that the developers would have gotten this
wrong.  Most likely the programmers did not take "missing values" into
account and I guess that they were taken as 0.  This can be tested by
running the same data set with the previous "missing value" replaced by
0.  It would yield the AOV table below (although the variances are
unequal and a normality test would have failed...).
I wonder who has had a similar experience, and if there are more of those
faults in the programme.  Although this was a very easy example and it
was clear from the df-values that there was a mistake, the question is
whether we should have to test any results before we trust them.
I have sent a note to the developers but have not yet received a
response.
Sample data set:
A       B       C       D
60.8    68.7    102.6   87.9
57      67.7    102.1   84.2
65      74      100.2   83.1
58.6    66.3    96.5    85.7
61.7    69.8            90.3
Analysis of Variance as shown by the QPRO 6.02 output
                SS              df      MS              F
Between Groups  1951.61         3       650.53667       1.271
Within Groups   8184.448        16      511.528
Total           10136.058       19
Analysis of Variance table (from Zar 1981)
                SS              df      MS              F
Between Groups  4226.35         3       1408.78         165
Within Groups   128.35          15      8.557
Total           4354.70         18
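For comparison, the correct computation on the posted data is a few lines, treating the empty cell in group C as genuinely missing rather than zero (so the group sizes are 5, 5, 4, 5, N = 19, and the df are 3 and 15, matching the Zar table):

```python
groups = {
    "A": [60.8, 57, 65, 58.6, 61.7],
    "B": [68.7, 67.7, 74, 66.3, 69.8],
    "C": [102.6, 102.1, 100.2, 96.5],      # one missing value: n = 4
    "D": [87.9, 84.2, 83.1, 85.7, 90.3],
}

all_vals = [v for g in groups.values() for v in g]
grand = sum(all_vals) / len(all_vals)

ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                 for g in groups.values())
ss_within = sum((v - sum(g) / len(g)) ** 2
                for g in groups.values() for v in g)

df_between = len(groups) - 1                              # 3
df_within = len(all_vals) - len(groups)                   # 15
F = (ss_between / df_between) / (ss_within / df_within)   # about 165
```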
 -----------------------------
Date:    Mon, 13 Jan 1997 09:26:42 -0600
From:    Clay Helberg 
Subject: Re: Multi-Dimensional Scaling
Central Inst for the Deaf wrote:
>
> Hi,
>
> I am currently searching for a software package that does a good job
> of multi-dimensional scaling. It will be used primarily for visualization
> of relatively large data sets.
>
> What features should I look for in the software package, and does anyone
> have suggestions for a particular package?
>
> Thanks,
> Don
Of course I am biased (see .sig below), but I think SPSS has a fine MDS
module (it is part of the Professional Statistics add-on package). It
can handle classical metric and nonmetric MDS, replicated MDS (metric
and nonmetric), and weighted MDS (individual differences scaling, or
INDSCAL). For more information, see .
BTW, a good reference for MDS is Young & Hamer (1987), _Multidimensional
Scaling: History, theory, and applications_. (Lawrence Erlbaum)
                                                --Clay
--
Clay Helberg         | Internet: helberg@execpc.com
Publications Dept.   | WWW: http://www.execpc.com/~helberg/
SPSS, Inc.           | Speaking only for myself....
 -----------------------------
Date:    Mon, 13 Jan 1997 11:01:56 -0800
From:    Chris Barker 415-852-3152 
Subject: Parameter Estimation by Sequential Testing
regarding PEST.
PEST is written by John Whitehead and Hazel Brunier. It stands for
Planning and Evaluation of Sequential Trials.
It provides the ability to design
and analyze a wide variety of statistical studies, where the conduct
of the study involves sequential analysis of the outcomes.
EAST also permits sequential analysis of studies, for example Lan-DeMets
procedures and "alpha spending functions".
Aside from the application to sequential trials, the underlying mathematics
differs.  I understand, but don't have citations, that Whitehead and others
have recently prepared some papers to explain the connections between the
PEST and EAST approaches.
 The email is MPS@READING.AC.UK
Tel: +44-1734-318027
However, you should contact Whitehead directly for details about the project.
>Date:    Thu, 9 Jan 1997 19:07:37 -0500
>From:    Pralay Senchaudhuri
>Subject: Re: Parameter Estimation by Sequential Testing
>
>Lenora Norma Brown wrote:
>> If anyone is familiar with the scientific statistical programme
>> PEST (Parameter Estimation by Sequential Testing) I'd be happy to
>> hear from you. It is a program that allows one to determine a
>> threshold in as few trials as possible.
>> thanks
>> --
>> Lenora N. Brown
>> Educational Psychology/Psychology
>> University of Calgary
>> 2500 University Drive N.W.
>> Calgary, Alberta  T2N 1N4
>> E-MAIL: lnbrown@acs.ucalgary.ca; Phone: 403-220-4667
>
>I have heard about PEST. But I have not used it. Cytel Software has a
>product EAST for determining the EArly STopping criteria. This is similar to
>PEST. To know more about this product try the web site www.cytel.com.
 -----------------------------
Date:    Thu, 9 Jan 1997 06:17:22 -0700
From:    Thomas Peters 
Subject: Re: Regression question: why R^2 = r^2?
Kai O'Yang  writes:
> Hi,
>
> Could someone explain to me why the square of coeff of determination equals
> to square of correlation coeff? Also, what are the assumptions of this
> equality?
>
> TIA,
> Kai
>
> --
>
> Kai O'Yang, PSCIT, Monash Uni, Australia. email: Kai.Oyang@monash.edu.au
> URL: http://pscit-www.fcit.monash.edu.au/~oyang     Tel: +61 3 9904 4615
> ------------------------------------------------------------------------
This is only true for a simple two-variable linear regression equation, i.e.
        y = a + bX + u, where u is i.i.d.
See any introductory econometrics text.
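A quick numerical check of the identity (a sketch in Python; the data here are made up for illustration):

```python
import numpy as np

# Made-up data for a simple one-predictor regression y = a + bX + u.
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(0, 1.5, 50)

# Fit by ordinary least squares (np.polyfit returns [slope, intercept]).
b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)

# Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
r_squared = 1 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)

# Squared Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

print(r_squared, r**2)  # the two agree for a single-predictor model
```

With more than one predictor, R^2 equals the squared correlation between y and the fitted values, not between y and any single predictor.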
--
Thomas Peters, Senior Consultant  IMS Canada  Thomas_Peters@imsi.ab.ca
 -----------------------------
Date:    Mon, 13 Jan 1997 16:32:31 GMT
From:    Richard F Ulrich 
Subject: Re: kappa
Chauncey Parker ("chaunce\"removethis\""@u.washington.edu) wrote:
: Richard F Ulrich wrote:
  << deleted >>
: snip . . .
: I'm not sure that I follow your rating task.  Seems that it might be an
: interval type measurement.  If so, Pearson's or t-test don't seem like
: they would express agreement and association as would an ICC.
: If your measure is interval-like, ICC is the interrater reliability
: stat to use, I would think.  But alas, I'm still quite a naive student.
It seems to me that there must be some blind spot in the way that
"reliability" is being taught, because the point that I was making is
a simple one...  yet this is not the first time that it has been
missed.
The intra-class correlation (ICC)  is a fine measurement for
publishing what you have achieved in "reliability".  Unfortunately,
it does nothing to illustrate or test or separate out the
 *systematic differences*  that may occur between raters - they
just serve to lower the correlation slightly, since the ICC makes
the assumption that the raters have equal means.
In almost any kind of work that I can think of, it *ought*  to be a
concern if one rater or rating is systematically higher than
another.  The powerful way to test this is with the paired t-test;
the concomitant statistic to the paired t-test is the Pearson
correlation  -  together they give both aspects of comparing the
ratings, SIMILARITY and DIFFERENCE.
So, the ICC may be what editors want to see, and it is okay as
a one-number summary, but anyone examining their own reliability
data has little excuse (IMHO) not to look at tests of difference,
where they are appropriate.
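The point is easy to see numerically. In this hypothetical sketch (Python; the ratings are invented, and the ICC(2,1) absolute-agreement formula used is one common two-way ANOVA form, not anything from the thread), rater 2 sits systematically about 2 points above rater 1: the Pearson correlation is essentially perfect, the paired t is huge, and the agreement ICC is pulled well below 1.

```python
import numpy as np

# Hypothetical ratings: rater 2 is ~2 points above rater 1, plus tiny noise.
r1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
r2 = r1 + 2.0 + np.array([0.1, -0.1, 0.05, -0.05, 0.0, 0.0])
n, k = len(r1), 2

# Paired t statistic: detects the *systematic difference* between raters.
d = r2 - r1
t = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Pearson correlation: measures *association* only; blind to the shift.
pearson = np.corrcoef(r1, r2)[0, 1]

# Two-way ANOVA sums of squares for an absolute-agreement ICC(2,1).
Y = np.column_stack([r1, r2])
m = Y.mean()
ss_rows = k * np.sum((Y.mean(axis=1) - m) ** 2)   # subjects
ss_cols = n * np.sum((Y.mean(axis=0) - m) ** 2)   # raters
ss_err = np.sum((Y - m) ** 2) - ss_rows - ss_cols
msr, msc, mse = ss_rows / (n - 1), ss_cols / (k - 1), ss_err / ((n - 1) * (k - 1))
icc = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

print(t, pearson, icc)  # huge t, pearson ~ 1, icc well below 1
```

The t-test flags the rater offset that the correlation alone would hide, which is exactly the argument for reporting both.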
Rich Ulrich, biostatistician                wpilib+@pitt.edu
http://www.pitt.edu/~wpilib/index.html   Univ. of Pittsburgh
 -----------------------------
Date:    Sat, 11 Jan 1997 07:44:26 -0800
From:    "Jeffrey R. Van Kirk" 
Subject: Power Analysis
Does anyone know of a program to do power analyses for various
statistical tests?  It should include ANOVA F-tests as well as others.
Jeff Van Kirk
Department of Psychiatry
University of Connecticut Health Center
Farmington, CT
vankirk@psychiatry.uchc.edu
 -----------------------------
Date:    Mon, 13 Jan 1997 10:54:52 -0700
From:    Karl Schultz 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
Uenal Mutlu wrote:
>
> On Fri, 10 Jan 1997 09:22:30 -0700, Karl Schultz  wrote:
>
> >C. K. Lester wrote:
> >>
> >> In article <32D53060.B3D@fc.hp.com>, Karl Schultz  wrote:
> >> >C. K. Lester wrote:
> >> >>
> >> >> In response to Karl Schultz's prior post,
> >> >>
> >> >> >There are no subsets.  The 168-ticket wheel will guarantee a 3-match
> >> >> >in a 6/49 lotto.
> >> >> >
> ...
> >The perceived value, IMHO, is as follows.  People like to win.
> >If they can be sure to walk away with something, then they
> >might take steps to do that.  The only way to increase your
> >chances of winning is to play more numbers.  If you are
> >in the habit of playing 100+ numbers at a time and have
> >had a long losing streak, you might be inclined to play
> >the 168-ticket wheel, so that you are sure to have to make
> >that trip to the counter to claim a prize.  Actually, you
> >have a 60+% chance of getting 3 wins with 168 tickets,
> >but that is another story.  So, it is a psychological
> >thing - sure to get a win.
>
> I get a different percent value:
It is "perceived" value.  That intangible value/advantage
that people THINK they are getting from a wheel.
I was discussing psychology more than math.
>
>   Since the probability for at least 3 is about 1 in 54 (to be exact
>   p=0.0186375450020), using the usual formula gives:
>
>     E = "at least once at least 3 matching nbrs in 168 consecutive
>          draws using the same fixed 1 ticket"
>
>     p(E) = 1 - (1 - 0.018637545)^168 = 0.9576
>
>     meaning a 95.76 % chance of the event E occurring.
>
This is right if you pick the 168 tickets randomly.
If you use the wheel, you get 100%.
I am not a wheel expert, but if you pick 168 tickets randomly,
you are going to have some number of duplicate coverage for 3-matches.
All the wheel does is
reduce the number of redundant 3-match coverages and makes sure that
all possible 6 number combinations result in a 3-match.
There is some redundancy left over.
Here is the output of Richard Lloyd's wheel checker.
Overall 3-match combination coverage summary:
Covered 10 times :     1
Covered  6 times :     3
Covered  5 times :     6
Covered  4 times :    59
Covered  3 times :    60
Covered   twice  :    52
Covered   once   :  2782
Covered   never  : 15461
Total:      2963 / 18424 (16.1%)
    Combination      Covered  Elapsed   Speed   To Go
 1 12 13 14 15 16    1210362    0:03   403454    0:31
 2  3  4  5  6  7    1712304    0:03   570768    0:21
[Reporting switched off during the final 30 seconds]
     Finished       13983816    0:26   537839    0:00
Losing  combinations:        0 (  0.0%)
Winning combinations: 13983816 (100.0%)
Total   combinations: 13983816 (for a 6 from 49 lottery)
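The checker's coverage table is internally consistent, as a quick arithmetic check (Python) shows: the counts sum to the C(49,3) = 18,424 possible triples, and the total coverage (including redundancy) equals 168 tickets times the C(6,3) = 20 triples each ticket contains.

```python
from math import comb

# Coverage table from the wheel-checker output: {times covered: number of triples}.
coverage = {10: 1, 6: 3, 5: 6, 4: 59, 3: 60, 2: 52, 1: 2782, 0: 15461}

triples = sum(coverage.values())                   # all 3-number combinations
covered = sum(n for t, n in coverage.items() if t > 0)
hits = sum(t * n for t, n in coverage.items())     # total coverage incl. redundancy

print(triples, comb(49, 3))    # 18424 18424
print(covered)                 # 2963, i.e. 16.1% of all triples
print(hits, 168 * comb(6, 3))  # 3360 3360: each ticket covers C(6,3)=20 triples
```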
> (BUT: I'm still not sure if this is the same as playing 168 different
> tickets in 1 drawing)
(different issue - start a new thread?)
 -----------------------------
Date:    Mon, 13 Jan 1997 15:44:50 -0500
From:    Bill Parr 
Subject: Response Surface Methods Conference! June 19 - 21, 1997
Interested in a research conference which is 50% focused on Response
Surface Methods, the other 50% on a variety of topics -- Statistical
Education, Neural Nets, Statistical Modeling?
Speakers to include: George E. P. Box, Ray Myers, John Cornell, Dennis Lin,
Richard Scheaffer...
The conference will be held in Gatlinburg, Tennessee (in the Smoky
Mountains National Park area) June 19 - 21, 1997.
For more details, check it out at:
http://funnelweb.utcc.utk.edu/~wparr/srcos.html
or by sending an email asking for information to wparr@utk.edu.
Bill Parr
--
William C. Parr
wparr@utk.edu
URL: http://funnelweb.utcc.utk.edu/~wparr/
Phone: 423-974-1631     Fax: 423-974-2490
 -----------------------------
Date:    Mon, 13 Jan 1997 17:47:07 GMT
From:    Normand Veilleux 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
>From: bm373592@muenchen.org (Uenal Mutlu)
>I think this posting contains useful hints for a successful play
>strategy! Read on!
>
>On Thu, 9 Jan 1997 14:38:13 GMT, nveilleu@NRCan.gc.ca (Normand Veilleux)
>wrote:
>>consecutive drawings if you buy 1 ticket per draw.  And even after 168
>>draws, there still is 0.042398 probability of having lost all draws.
>
>That's saying 95.76 % chance of winning (>= 3) if playing the same 1 ticket
>in 168 consecutive draws. IMHO an important conclusion from this would be:
> Playing the same 1 ticket in x consecutive draws is better than playing
> x different tickets (or a wheel) in 1 draw.
>Isn't it?
Nope, it's the inverse.  Which is highest, 95.76% for 168 draws of 1
ticket each or 100% for 168 tickets in one draw?
You can also see this with a very small lottery.  Say a certain
lottery has only 2 possible combinations and only 1 combination
wins a prize (like a coin flip).  Obviously, if you buy all 2
combinations in one draw, then you are guaranteed to win.  But if
you buy 1 combination for 2 consecutive draws, you only have 75%
chance of winning at least one of the prizes.  The big difference
between the 2, like has been mentioned already, is that in the
second case you have the opportunity to win the "jackpot" twice,
but only once in the first case.
The two factors, taken together, result in exactly the same
average expected return.
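The two-combination toy lottery makes that arithmetic easy to verify (Python):

```python
# Toy lottery: 2 possible combinations, 1 winner per draw (a coin flip).
p_one_draw_both = 1.0            # buy both combinations in one draw: certain win
p_two_draws_one = 1 - 0.5 ** 2   # one combination for 2 draws: 1 - P(lose twice)

# Expected number of wins is identical either way: 2 tickets x 1/2 = 1.
ev_one_draw = 2 * 0.5   # exactly 1 win, never 0, never 2
ev_two_draws = 2 * 0.5  # can be 0, 1, or 2 wins, averaging 1

print(p_two_draws_one)  # 0.75
```

Same expected return, different spread: spreading tickets across draws trades certainty of one win for a shot at two.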
>If yes, then the further practical generalization of this statement
>would be:
> Don't change your numbers; ie. play always the same numbers (tickets or
> wheel) until you have a win.
> --> So one should also very well think of analysing the past draws for
>     choosing the 'right' expected numbers (it's normally a one-time task)
Looks like you missed something.  The probabilities do not care which
ticket you buy, so you can keep the same ticket all the time or keep
it for some time and then change it or even change it for every drawing.
The probabilities remain the same because every ticket has 260,624
ways of winning out of 13,983,816 combinations.
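Both numbers check out combinatorially (Python sketch): a fixed ticket matches at least 3 of the 6 drawn numbers by choosing k of its own 6 numbers and 6-k of the other 43, and 168 independent draws then reproduce the 95.76% quoted earlier in the thread.

```python
from math import comb

# Ways a fixed 6-number ticket matches at least 3 of the 6 drawn numbers.
ways = sum(comb(6, k) * comb(43, 6 - k) for k in range(3, 7))
total = comb(49, 6)
p = ways / total              # ~0.0186375450020, i.e. about 1 in 54

p_168 = 1 - (1 - p) ** 168    # at least one 3+ match in 168 independent draws

print(ways, total)            # 260624 13983816
print(round(p_168, 4))        # 0.9576
```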
>>If you do come up with the same number, then it implies that wheeling
>>does not change, in any way, the average number of winning tickets.
>
>But then also the opposite is true: wheeling is at least equal to using
>the same number of any different randomly chosen single tickets. True?
>
>Are there any situations where wheeling behaves worse than using
>randomly (or in any other way) chosen different tickets of the
>same size?
Maybe it's just because you are new at this Uenal, but this has been
stated a million times.  If the lottery is random, then every imaginable
system will have the exact same average expected gain as would a system
based on randomly selecting numbers.  Of course, it will take a lot of
draws to prove it to deluded system authors, but that's what the math
predicts:  all systems will be equivalent within statistical significance.
 -----------------------------
Date:    Mon, 13 Jan 1997 20:44:31 +0000
From:    ctmq 
Subject: Re: An internet conference on Quality
In article <5b45sk$o62@nuscc.nus.sg>, Xie Min 
writes
>Please take a look at this and post your comments!
>
>http://www.mcb.co.uk/services/conferen/jan97/tqm/tqasia/theme13.htm
>
>--------- an introduction ---------
>
>
>The Asia-Pacific has commanded the economic headlines for
>the past decade. It is now well recognised that Asia-Pacific
>productivity levels have risen to world-beating standards.
>Asian prosperity has brought indigenous technological
>advance and high skill work forces contributing to greater
>confidence. To sustain this success, organisations are
>shifting from improving the efficiency of investments and
>reducing costs to focusing on adding value through creativity
>and innovation. This conference will address the influence of
>the management of quality on this undoubted success.
>

>Finally, befitting perhaps of this type of conference, Dr
>Richard Barson of the University of Nottingham (UK) will
>moderate deliberations on the management of quality in the
>development of information technology (which perhaps in
>part enabled the possibilities of conducting such
>conferences?).
>
>Thank you, and I hope you will find this an added value
>experience.
>
>Christopher Seow
>
>Convenor of Internet Conference
>Total Quality in the Asia Pacific
>
Hi Christopher
With your permission.
I noticed the strong emphasis on *quality management* and the strong
overview of the trends and standards as analysed by academic institution
representatives.
Although I have nothing against academic research etc. - much will be
lost in not knowing the vision, application and strategy of
key players in the economic reality of businesses - 2050.
In addition, there seems little interest in *Total Management Quality*
which is paramount to the foundation and maintenance of acceptable
standards for total quality management.  There is a real difference.
Regards
Harry
Total Management Quality Group
Centre for Total Management Quality
--
ctmq
 -----------------------------
Date:    Mon, 13 Jan 1997 11:05:00 -0700
From:    "David L. Turner" 
Subject: gen'zd lm multiple comparisons
I have a client who is interested in performing multiple comparisons in a
generalized linear model (specifically Poisson and/or logistic
regression).  I am able to "trick" SAS into doing some simple linear
combinations, but this is clumsy to say the least.  Any other ideas?
Please respond directly as well as a general post as my news reader
doesn't keep articles very long.
Thank you for the help.
 -----------------------------
Date:    Sat, 11 Jan 1997 15:49:04 GMT
From:    Uenal Mutlu 
Subject: Re: Probability and Wheels: Connections and Closing the Gap
(This is a resend due to a bug in one of the newsgroup names; sorry if dupe)
On Thu, 09 Jan 1997 17:25:10 -0700, Karl Schultz  wrote:
>2) Generates 54 drawings to represent the player playing 54 tickets.
I'm not sure about this point, because in my posting I had asked
Mr. Sharkey about the interchangeability of these two things, ie.
>>I would say that the probability for both the below events is equal:
>> - playing the same 1 ticket in 54 consecutive draws
>> - playing 54 different tickets in 1 drawing
>>
>>Isn't it?
>
>No, it's not.  Not quite.
He also gave detailed examples (thanks), but since I can't play cards :-)
I have a hard time picturing them, so such examples with cards
confuse me more than they help me (I know it is my problem, but
if possible, I would like to see another example without cards etc.;
pure math and/or a formula would be ok.)
Another confusing thing (for me) is also the difference (if any)
between 'probability' and 'chance', for example in the following
excerpt of a posting of again Mr. Sharkey:
>If odds of AT LEAST 3 is 1/54 then chance of getting a 3 match  in 54 is
>    1 - [ (1- 1/54)^54]  or about  0.6355
(here, it's meant in 54 consecutive drawings (not 54 tickets in 1 drawing),
if I got it right).
But, what does this exactly mean? Here, we are using 1 ticket. Which of
the below is true:
 - 63.55 % chance of winning "at least 3" in the next drawing
 - 63.55 % chance of winning "at least 3" after participating in 54
   consecutive drawings
- Does the above apply only if the same 1 ticket (ie. fixed nbrs) is used?
- Does it matter (for the above formula) if in every drawing different
  (ie. randomly chosen) numbers are played (on 1 ticket).
If one plays in 54 consecutive drawings using the same 1 ticket:
 - is then the probability for   "at least 3" constant or does it change?
 - is then the chance of winning "at least 3" constant or does it change?
(Sorry if I used any wrong terms (i.e. probability vs. chance etc.), but
IMHO it should be clear what is meant.)
 -----------------------------
Date:    Mon, 13 Jan 1997 23:41:17 GMT
From:    "Bruce L. Lambert, Ph.D." 
Subject: Power for repeated measures designs
Hi everyone,
I am planning some memory experiments that will use repeated measures
(a.k.a., within subjects) designs. However, I do not know how to estimate
sample sizes for these designs. Cohen's text has no index entry for
"repeated measures" or "within subjects" and the chapter on ANOVA does
not give much guidance on the issue as far as I can tell. Most published
studies I've read in reviewing the short-term memory literature use quite
small sample sizes (< 40), but each participant produces many dozens,
even hundreds of data points, due to repeated trials. Can someone give me
pointers in this area?
Thanks.
Bruce L. Lambert, Ph.D.
Department of Pharmacy Administration
University of Illinois at Chicago
lambertb@uic.edu
http://ludwig.pmad.uic.edu/~bruce/
Phone: +1 (312) 996-2411
Fax:   +1 (312) 996-0868
 -----------------------------
End of STAT-L Digest - 12 Jan 1997 to 13 Jan 1997
*************************************************
------------------------------
Date:    Sat, 11 Jan 1997 08:53:41 -0800
From:    "Kenneth M. Lin" 
Subject: The New Palgrave's Time Series and Statistics Book FS
For Sale:
        The New Palgrave: Time Series and Statistics
        edited by John Eatwell, Murray Milgate, and Peter Newman
        Brand new hardcover, never read
        Asking Price $35.00 (including shipping and handling)
If anyone is interested, drop me a note.
Ken
------------------------------
Date:    Mon, 13 Jan 1997 08:14:29 +0100
From:    David Rothman 
Subject: Re: LOWESS regression
ANNE KNOX wrote:
>
> Sorry to bombard this newsgroup with regression questions!
>
> I'm looking for statistical programs that can perform locally weighted
> (LOWESS) regressions.  Any suggestions?  Also, are there any references
> that discuss statistical testing of LOWESS regressions?
>
> Thanks,
>
> - Anne
look at the bell labs site.  there are papers and code by cleveland.
------------------------------
Date:    Tue, 14 Jan 1997 09:38:32 +0000
From:    Ronan Conroy 
Subject: Re: LOWESS regression
ANNE KNOX wrote:
> Sorry to bombard this newsgroup with regression questions!
>
> I'm looking for statistical programs that can perform locally weighted
> (LOWESS) regressions.  Any suggestions?  Also, are there any references
> that discuss statistical testing of LOWESS regressions?
>
Lowess smoothing is available in Data Desk and Stata (under the ksm
command).
Smoothing is a way of filtering your data to look for signal within the
noise. It starts from the opposite position to classical regression in
which the form of the function is specified in advance and the parameters
are estimated from the data. For that reason it does not test a specified
model and therefore isn't a hypothesis test.
A smoother is very important, however, when you are fitting a model. It
allows you to see if, for example, the variables really seem to take on a
linear relationship throughout their range. It is useful for detecting
phenomena which might otherwise go unnoticed, such as a threshold effect
where a relationship abruptly changes direction or magnitude. Using a
smoother is a useful check that your model isn't a serious
misrepresentation of your data.
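For readers without those packages, the core of the lowess idea (local, tricube-weighted straight-line fits) can be sketched in a few lines of Python. This is a bare-bones illustration of the principle, not Cleveland's full algorithm, which adds robustness iterations:

```python
import numpy as np

def lowess_sketch(x, y, frac=0.5):
    """At each x[i], fit a weighted straight line to the nearest
    frac*n points, with tricube weights that fade with distance."""
    n = len(x)
    k = max(3, int(frac * n))
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:k]                          # k nearest neighbours
        w = (1 - (dist[idx] / dist[idx].max()) ** 3) ** 3   # tricube weights
        b, a = np.polyfit(x[idx], y[idx], 1, w=np.sqrt(w))  # weighted local line
        fitted[i] = a + b * x[i]
    return fitted

# Sanity check: on exactly linear data the local fits recover the line itself.
x = np.linspace(0, 1, 20)
y = 2 * x + 1
print(np.allclose(lowess_sketch(x, y), y))  # True
```

The smooth comes from sliding these local fits across the range, which is why it can follow thresholds and bends that a single global line cannot.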
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
    _/_/_/      _/_/     _/_/_/     _/     Ronan M Conroy
   _/    _/   _/   _/  _/          _/      Lecturer in Biostatistics
  _/_/_/    _/          _/_/_/    _/       Royal College of Surgeons
 _/   _/     _/              _/  _/        Dublin 2, Ireland
_/     _/     _/_/     _/_/_/   _/         voice +353 1 402 2431
                                           fax   +353 1 402 2329
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Get a life, they said, but I couldn't find the ftp site...
------------------------------
Date:    Sat, 11 Jan 1997 14:23:04 -0600
From:    Central Inst for the Deaf 
Subject: Multi-Dimensional Scaling
Hi,
I am currently searching for a software package that does a good job
of multi-dimensional scaling. It will be used primarily for visualization
of relatively large data sets.
What features should I look for in the software package, and does anyone
have suggestions for a particular package?
Thanks,
Don
------------------------------
Date:    Sat, 11 Jan 1997 23:53:27 GMT
From:    "Robert A. Belflower" 
Subject: Re: population census/estimates
Try http://www.census.gov/cgi-bin/gazetteer.  It contains a lot of data
from the 1990 US Census.
--
Rob
Robert A. Belflower
belflowr@der.1stnet.com
http://www.lee.1stnet.com/~belflowr/index.htm
Gordon Johnson  wrote in article
<5b905o$5h5@kraken.ifb.net>...
> 101614.1440@compuserve.com wrote:
>
> >Hello out there,
> We're here ( or hereabouts)!
> >I am looking for population census data and data for
> >population estimates of countries, administrative areas
> >and especially of towns and places. Until now I only
> >found some data from the US (though it seems not complete,
> >because I could not find population estimates eg for
> >Metairie, Louisiana; Citrus Heights, California, or
> >places of Hawaii like Hilo).
>
> >Is there anyone who can help me to find sources,
> >especially on the web? Please email me.
>
> >regards
> >S. Helders
> **I am not clear whether this is a request for current,
> present-day data, or historical population data, this having
> appeared in a genealogy newsgroup. Can you clarify, as it makes a
> big difference about possible sources.
>
> Gordon Johnson's homepage -
> http://www.wintermute.co.uk/~kinman/
> (With Scottish genealogical goodies)
>
>
------------------------------
Date:    Thu, 9 Jan 1997 17:50:30 GMT
From:    Steven T Barlow 
Subject: Comparing R^2 for different DVs?
I have two regression equations.  Both use the same IVs but different
DVs. Is there a test for significance for difference between the R^2 for
the two equations?
Any help is greatly appreciated.
------------------------------
Date:    Sun, 12 Jan 1997 02:00:06 GMT
From:    Andrew Gray 
Subject: Combining Neural/Fuzzy Models with Statistical Models
Hi,
    I'm working on combining neural networks and fuzzy logic models
with statistical techniques (regression and data reduction) for
software metrics (for example, predicting development time based on
the type and size of system).  While there has been a lot of work on
neural-fuzzy, neural-genetic, fuzzy-genetic, etc. type systems I've
only ever found a small number of researchers using AI/statistical
techniques (presumably at least partially an indication of how few AI
researchers follow the statistical side of things, and vice versa).
If anyone out there is working on, or knows of some such work, then
I'd be really interested in hearing from you.  If anyone would like to
post their ideas on this to the newsgroup, it would be nice to get
some discussion going.
Cheers,
Andrew
Software Metrics Research Laboratory, University of Otago
Phone: +64 3 479 5282 Fax: +64 3 479 8311
email: agray@commerce.otago.ac.nz
http://divcom.otago.ac.nz:800/COM/INFOSCI/SMRL/people/andrew/andrewg.htm (Home Page)
http://divcom.otago.ac.nz:800/COM/INFOSCI/SMRL/home.htm (SMRL's Home Page)
------------------------------
Date:    Tue, 14 Jan 1997 10:49:56 CDT
From:    Ed Cook 
Subject: Re: Power for repeated measures designs
If you're conducting planned and focussed repeated measures tests
(i.e., comparing pairs of means without correction) then you
can conceptualize these as pairwise t-tests, and use the relevant
sections in Cohen's power text (pp. 48 ff.).
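A general-purpose fallback for designs Cohen's tables don't cover is simulation: generate data under the effect you expect and count rejections. A sketch (Python; the effect size d = 0.5 on the paired differences and n = 40 participants are purely illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, d, alpha, reps = 40, 0.5, 0.05, 2000  # n pairs, standardized difference d

tcrit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # two-sided critical value
rejections = 0
for _ in range(reps):
    # Simulated paired differences with mean d and unit SD.
    diffs = rng.normal(loc=d, scale=1.0, size=n)
    t = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(n))
    rejections += abs(t) > tcrit

power = rejections / reps
print(round(power, 2))  # roughly 0.87 for these settings
```

The same loop extends to full repeated-measures layouts: simulate the within-subject correlation structure you expect and apply whatever analysis you plan to run.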
Ed Cook, Assoc Prof of Psychology, Univ of Alabama at Birmingham, USA
------------------------------
Date:    Mon, 13 Jan 1997 06:22:26 -0800
From:    Yi-Yi Chen 
Subject: Help--Application of a characteristic function
Can anyone tell me whether I can find the reference about the
relationship between a characteristic function and its "conditional"
expectation?  Most textbooks have only the relation between a
characteristic function and the "unconditional" expectation.  I need
this info because I have to use a characteristic function to figure out
a conditional expectation.
Any information will be appreciated!
------------------------------
Date:    Tue, 14 Jan 1997 13:30:58 GMT
From:    Mark Hansen 
Subject: 1997 New Researchers' Conference
                   CONFERENCE ANNOUNCEMENT
   The Third North American Conference of New Researchers
                       July 23-26, 1997
                       Laramie, Wyoming.
The purpose of this meeting is to provide a venue for recent
Ph.D. recipients in Statistics and Probability to meet and share their
research ideas. All participants will give a short expository talk or
poster on their research work.  In addition, three senior speakers
will present overview talks.  Anyone who has received a Ph.D. after
1992 or expects to receive one by 1998 is eligible.  The meeting is to
be held immediately prior to the IMS Annual Meeting in Park City, Utah
(July 28--31, 1997), and participants are encouraged to attend both
meetings.  Abstracts for papers and posters presented in Laramie will
appear in the IMS Bulletin.
The New Researchers' Meeting will be held on the campus of the
University of Wyoming in Laramie, and housing will be provided in the
dormitories.  Transportation to Park City will be available via a
charter bus.  Partial support to defray travel and housing costs is
available for IMS members who will also be attending the Park City
meetings, and for members of sponsoring sections of the ASA.
Additional information on the conference and registration is available
at the website: http://www.math.unm.edu/NR97.html.  Or contact
Prof. Snehalata Huzurbazar, Department of Statistics, University of
Wyoming, Laramie, WY 82071-3332, USA; email: lata@uwyo.edu; fax:
307-766-3927.
This meeting is sponsored in part by the Institute of Mathematical
Statistics; the National Science Foundation, Statistics and
Probability Program; the ASA Section on Bayesian Statistical Sciences;
the ASA Section on Statistical Computing; and the ASA Section on
Quality and Productivity.
-----------------------------------------------------------------------------
 Room 2C-260, Bell Laboratories
 Innovations for Lucent Technologies    Phone: (908) 582-3868
 700 Mountain Avenue                    Fax:   (908) 582-3340
 Murray Hill, NJ 07974                  Email: cocteau@research.bell-labs.com
 URL: http://cm.bell-labs.com/who/cocteau/index.html
------------------------------
End of STAT-L Digest - 13 Jan 1997 to 14 Jan 1997 - Special issue
*****************************************************************
Subject: Modern Regression and Classification - Hawaii
From: Trevor Hastie
Date: Tue, 14 Jan 1997 11:35:46 -0800
************* 1997 Course Announcement *********
      MODERN REGRESSION AND CLASSIFICATION
       Waikiki, Hawaii: February 17-18, 1997
*************************************************
A two-day course on widely applicable statistical methods for
modeling and prediction, featuring
Professor Trevor Hastie    and   Professor Robert Tibshirani
Stanford University              University of Toronto
This course was offered and enthusiastically attended at five
different locations in the USA in 1996.
This two day course covers modern tools for statistical prediction and
classification. We start from square one, with a review of linear
techniques for regression and classification, and then take attendees
through a tour of:
 o  Flexible regression techniques
 o  Classification and regression trees
 o  Neural networks
 o  Projection pursuit regression
 o  Nearest Neighbor methods
 o  Learning vector quantization
 o  Wavelets
 o  Bootstrap and cross-validation
We will also illustrate software tools for implementing the methods.
Our objective is to provide attendees with the background and
knowledge necessary to apply these modern tools to solve their own
real-world problems. The course is geared for:
     o  Statisticians
     o  Financial analysts
     o  Industrial managers
     o  Medical and Quantitative  researchers
     o  Scientists
     o  others interested in  prediction and  classification
Attendees should have an undergraduate degree in a quantitative
field, or have knowledge and experience working in such a field.
PRICE: $750 per attendee if received by January 15, 1997. Full time
registered students receive a 40% discount.  Attendance is limited to
the first 60 applicants, so sign up soon!  These courses fill up
quickly.
TO REGISTER: Fill in and return the form appended.
For more details on the course and the instructors:
   o point your web browser to:
        http://stat.stanford.edu/~trevor/mrc.html
        OR send a request by
   o FAX to Prof. T. Hastie at (415) 326-0854, OR
   o email to trevor@stat.stanford.edu
<----------------------------- Cut Here ------------------------------->
 Please print, and fill in the hard copy to return by mail or FAX
                                REGISTRATION FORM
                    Modern Regression and Classification
             Monday, February 17 and Tuesday, February 18, 1997.
          Hilton Hawaiian Village, Waikiki Beach, Honolulu, Hawaii.
         Name   ___________________________________________________
                Last                 First                   Middle
         Firm or Institution  ______________________________________
        Standard Registration ____         Student Registration ____
         Mailing Address (for receipt)     _________________________
         __________________________________________________________
         __________________________________________________________
         __________________________________________________________
          Country                    Phone                      FAX
         __________________________________________________________
                               email address
       __________________________________________     _______________
       Credit card # (if payment by credit card)      Expiration Date
                  (Lunch preference - tick as appropriate):
         ___ Vegetarian                           ___ Non-Vegetarian
Fee payment can be made by MONEY ORDER , PERSONAL CHECK, or CREDIT CARD
(Mastercard or Visa.) For checks and money orders: all amounts are given in
US dollar figures. Make fee payable to Prof. T. Hastie. Mail it, together
with this completed Registration Form to:
Prof. T. Hastie
538 Campus Drive
Stanford
CA 94305
USA
For payment by credit card, include credit card details above, and mail to
above address, or else FAX form to 415-326-0854
For further information, contact:
Trevor Hastie
Stanford University
Tel. or FAX: 415-326-0854
e-mail: trevor@stat.stanford.edu.
http://stat.stanford.edu/~trevor/mrc.html
REGISTRATION FEE
Standard Registration: U.S. $750 ($950 after Jan 25, 1997)
Student Registration: U.S. $450 ($530 after Jan 25, 1997)
Student registrations - include copy of student ID.
- Cancellation policy: No fee if cancellation before Jan 25, 1997.
- Cancellation fee after January 25 but before Feb 12, 1997: $100.
- Refund at discretion of organizers if cancellation after Feb 12, 1997.
- Registration fee includes course materials, coffee breaks, and lunches
- On-site Registration is possible if course is not fully booked, at late
fee.