Newsgroup sci.stat.math 12226

Articles

Subject: Econometrics: Periodic Models.
From: kleong@tartarus.uwa.edu.au (Weng Chong Leong)
Date: 19 Dec 1996 17:13:56 GMT

Hi,
If anyone is doing research on PAR models, periodic integration, or
periodic cointegration, I would appreciate some help with certain
problems I'm having.
Please reply by email.
Many thanks.
Kenneth.

Subject: Characterization of multivariate dsn
From: Theodore Sternberg
Date: 19 Dec 1996 18:30:52 GMT

Do there exist multivariate distributions that, like the normal, are fully
characterised by their first two moments, but unlike the normal are easily
evaluated (even in high dimensions)?  My dream distribution would look as
"normal" as possible, but be easy to evaluate. 
Would I need to look for an exponential-family distribution whose
sufficient statistics involve only sums, sums of squares and cross-sums? 
I suppose an infinite number of such beasts exist, and so the question is
how do I cook up one that has an easy-to-evaluate cumulative distribution? 
Am I on the right track here?
Ted Sternberg
San Jose, California USA

Subject: eigenvectors of rotated solution
From: "Ralf Schulze"
Date: 19 Dec 1996 19:03:07 GMT

Hello,
I'm trying to find the eigenvectors corresponding to the factors
of a (varimax-)rotated principal components solution. I have three factors,
the transformation matrix, the unrotated solution of the factor pattern and
the "rotated eigenvalues" from the rotated factor pattern. I tried to
compute the eigenvectors for the rotated solution via:
F*D**-.5 = A
where A is the matrix of the eigenvectors, F is the rotated factor pattern
and D**-.5 is the inverse of a diagonal matrix with the square roots of the
eigenvalues in the principal diagonal.
This works fine for the unrotated solution (for which I have the
eigenvectors, of course) but not for the rotated solution, because the
resulting eigenvectors are not mutually orthogonal (I think they have to be
orthogonal, but A'A is not diagonal).
This is probably a simple question for you but I really can't figure
out why the resulting eigenvectors are not orthogonal and therefore
this simple formula doesn't work.
Thanks in advance for any helpful comments.
-- 
Ralf Schulze
LS II Psychologie
Universität Mannheim
EMail: schulze@tnt.psychologie.uni-mannheim.de
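
[Archive note] The situation in this post can be sketched numerically. The
example below is only an illustration under stated assumptions: the 2x2
correlation matrix (r = 0.6), the 30-degree rotation used as a stand-in for a
varimax transformation matrix, and all variable names are made up. It
suggests that F'F is diagonal for the unrotated pattern, while for a rotated
pattern F_rot = F*T the product F_rot'F_rot = T'DT is generally not diagonal,
so rescaling its columns does not give orthogonal vectors.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

# Hypothetical 2x2 correlation matrix with correlation r. Its eigenvectors
# are (1, 1)/sqrt(2) and (1, -1)/sqrt(2), with eigenvalues 1 + r and 1 - r.
r = 0.6
s = 1 / math.sqrt(2)
V = [[s, s], [s, -s]]          # eigenvectors in columns
eigvals = [1 + r, 1 - r]

# Unrotated factor pattern: F = V * D**.5
F = [[V[i][j] * math.sqrt(eigvals[j]) for j in range(2)] for i in range(2)]

# Arbitrary orthogonal rotation T (assumption: stands in for varimax).
theta = math.radians(30)
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
F_rot = matmul(F, T)

FtF = matmul(transpose(F), F)               # equals D, diagonal
FrtFr = matmul(transpose(F_rot), F_rot)     # equals T'DT, not diagonal

# "Rotated eigenvalues": column sums of squares of the rotated pattern.
rot_vals = [FrtFr[0][0], FrtFr[1][1]]
A_rot = [[F_rot[i][j] / math.sqrt(rot_vals[j]) for j in range(2)]
         for i in range(2)]
AtA = matmul(transpose(A_rot), A_rot)

# Both patterns reproduce the same correlation matrix.
R = matmul(F, transpose(F))
R_rot = matmul(F_rot, transpose(F_rot))

print("F'F off-diagonal:        ", round(FtF[0][1], 6))
print("F_rot'F_rot off-diagonal:", round(FrtFr[0][1], 6))
print("A'A off-diagonal:        ", round(AtA[0][1], 6))
print("r from F, r from F_rot:  ", round(R[0][1], 6), round(R_rot[0][1], 6))
```

In this sketch the rotated pattern still reproduces the correlation matrix,
but its rescaled columns are not eigenvectors of that matrix; only the
unrotated columns are.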

Subject: Re: Simulation - Monte Carlo?
From: radford@cs.toronto.edu (Radford Neal)
Date: 19 Dec 96 19:47:05 GMT

In article <32B85621.13D36AFF@ulst.ac.uk>,
JG.Campbell wrote:
>I used to use the term 'Monte Carlo simulation' for simulation
>procedures like that described below although I'm fairly sure now that
>this is an abuse of the term...
>Simulation: we know the probability density/distribution of q, so,
>using appropriate random number generators, we generate values qi, and
>compute yi = f(x;qi).
>
>Using a large number of iterations, we estimate the distribution of y
>-- or some statistics. We used to call this a Monte Carlo 'loop'.
>
>Hence questions: (a) is this 'Monte Carlo simulation' in any reasonable
>interpretation of the term? (b) If not, is there another appropriate
>term?
It is quite correct to describe this as a "Monte Carlo simulation".
The term is rather general.  This might also be the best way to
do what you're doing, especially if q is high dimensional, and the
function f is complicated.
----------------------------------------------------------------------------
Radford M. Neal                                       radford@cs.utoronto.ca
Dept. of Statistics and Dept. of Computer Science radford@utstat.utoronto.ca
University of Toronto                     http://www.cs.utoronto.ca/~radford
----------------------------------------------------------------------------
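
[Archive note] The "Monte Carlo loop" discussed in this exchange can be
sketched as follows. This is a minimal illustration, not code from either
poster: the function f, the Normal(1.0, 0.1) distribution assumed for q, and
the input x = 2.0 are all hypothetical choices.

```python
import random
import statistics

def f(x, q):
    """Hypothetical model: output depends on a fixed input x and a random q."""
    return x * q ** 2

def monte_carlo(x, n_iter=10_000, seed=0):
    """Draw q_i from its assumed distribution and return y_i = f(x; q_i)."""
    rng = random.Random(seed)
    return [f(x, rng.gauss(1.0, 0.1)) for _ in range(n_iter)]

ys = monte_carlo(x=2.0)
print("estimated mean of y:", statistics.mean(ys))
print("estimated std of y: ", statistics.stdev(ys))
```

Any statistic of y (quantiles, tail probabilities) can be estimated from the
simulated values in the same way.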

Subject: Job Announcements - Jr. and Sr. Positions
From: chappell@becrux.biostat.wisc.edu (Rick Chappell)
Date: 19 Dec 1996 21:20:28 GMT

Job Openings Announcement  -  The University of Wisconsin Medical School
                              Department of Biostatistics
The U.W. Dept. of Biostatistics is seeking candidates for two tenure track
biostatistician positions - one Assoc. or Full professor and one Asst.
Professor.  Candidates should have a PhD in biostatistics or statistics and a
track record in teaching and in statistical & collaborative research.
Responsibilities include teaching, research, & appropriate univ. &
professional service.
Applicants: send resume and 3 letters of reference to David DeMets, Chair,
Dept. of Biostatistics, University of Wisconsin-Madison, 600 Highland Avenue,
Room K6/446, Madison, WI 53792-4676. Applications will be accepted until the
position is filled.  AA/EOE.  Women & minorities are encouraged to apply.
Unless confidentiality is requested in writing, information regarding
applicants must be released upon request.  Finalists cannot be guaranteed
confidentiality.

Subject: Re: SAS help
From: hamer@rci.rutgers.edu (Robert Hamer)
Date: 19 Dec 1996 17:28:20 -0500

nakhob@mat.ulaval.ca (Renaud Langis) writes:
>On Fri, 13 Dec 1996 00:07:22 -0500, Ya-Fen Lo 
>wrote:
>>Is it possible to perform tests of simple effects
>>(as defined in APPLIED STATISTICS by HINKEL/WIERSMA/JURS)
>>in SAS ? I am using the following setup
Yes.
>You can use the TEST statement in proc GLM. Maybe also in proc ANOVA. Do you
>simply want to know if an effect is significant? if so, just check the ANOVA
>table.
That is not what the original question asked.  That person wants
to contrast levels of one effect at specific levels of the other
effect.  One has to do that with CONTRAST or ESTIMATE statements.
>I suppose this is just a typing error but CLASSES should be written CLASS.
Actually, CLASSES works just fine.  It is one of the several
alternative forms of the statement available.
-- 
--(Signature)      Robert M. Hamer hamer@rci.rutgers.edu 908 235 4218
  Do not send me unsolicited email advertisements.  I have never and
  will never buy.  I will complain to your postmaster.
  "Mit der Dummheit kaempfen Goetter selbst vergebens" -- Schiller

Subject: Need code for an ARIMA (1,0,1) Model
From: catullus@laraby.tiac.net (Robert Kelley)
Date: 19 Dec 96 23:27:11 GMT

Hello:
	I am searching for source code that calculates an ARIMA(1,0,1) model.
I would prefer it in some flavor of Basic, but am willing to
look at any source code that can do it.
	I would even use a stat-library, but would prefer not to use an
executable.  I need to stick an ARIMA(1,0,1) model into pre-existing
source code.
	Any leads on this would be *greatly* appreciated.
	catullus@laraby.tiac.net 
--
_______________________________________________________________________
Robert W. Kelley  (http://www.tiac.net/users/rkelley/)  "odi et amo..."
Nothing makes one so vain as being told that one is a sinner.
Conscience makes egotists of us all.
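
[Archive note] For readers with the same need, here is a minimal sketch of
the ARIMA(1,0,1), i.e. ARMA(1,1), recursion
x[t] = c + phi*x[t-1] + e[t] + theta*e[t-1]. It is written in Python rather
than Basic, and everything in it is an illustrative assumption: the function
names, the simulated data, the zero starting residual, and especially the
coarse grid search on the conditional sum of squares, which a real
application would replace with a proper optimizer or a statistical library.

```python
import random

def simulate_arma11(n, phi, theta, c=0.0, sigma=1.0, seed=0):
    """Simulate n values of an ARMA(1,1) process with Normal innovations."""
    rng = random.Random(seed)
    x_prev, e_prev = c / (1 - phi), 0.0    # start at the process mean
    xs = []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)
        x = c + phi * x_prev + e + theta * e_prev
        xs.append(x)
        x_prev, e_prev = x, e
    return xs

def css(xs, phi, theta, c=0.0):
    """Conditional sum of squared one-step residuals, starting from e = 0."""
    e_prev, total = 0.0, 0.0
    for t in range(1, len(xs)):
        e = xs[t] - c - phi * xs[t - 1] - theta * e_prev
        total += e * e
        e_prev = e
    return total

def fit_grid(xs):
    """Pick (phi, theta) on a 0.05 grid in [-0.95, 0.95] minimizing css."""
    grid = [i / 20 for i in range(-19, 20)]
    return min(((p, q) for p in grid for q in grid),
               key=lambda pq: css(xs, *pq))

def forecast_one_step(xs, phi, theta, c=0.0):
    """One-step-ahead forecast using the last value and the last residual."""
    e_prev = 0.0
    for t in range(1, len(xs)):
        e_prev = xs[t] - c - phi * xs[t - 1] - theta * e_prev
    return c + phi * xs[-1] + theta * e_prev

xs = simulate_arma11(1000, phi=0.7, theta=0.3)
phi_hat, theta_hat = fit_grid(xs)
print("estimated phi, theta:", phi_hat, theta_hat)
print("one-step forecast:   ", forecast_one_step(xs, phi_hat, theta_hat))
```

The residual recursion in css is the part that would carry over to other
languages; the search over parameters is deliberately simplistic.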

Subject: Modern Regression and Classification course - Hawaii
From: Trevor Hastie
Date: 19 Dec 1996 16:42:11 -0800

************* 1997 Course Announcement *********
      MODERN REGRESSION AND CLASSIFICATION
       Waikiki, Hawaii: February 17-18, 1997 
*************************************************
A two-day course on widely applicable statistical methods for
modeling and prediction, featuring
Professor Trevor Hastie    and   Professor Robert Tibshirani
Stanford University              University of Toronto
This course was offered and enthusiastically attended at five
different locations in the USA in 1996.
This two-day course covers modern tools for statistical prediction and
classification. We start from square one, with a review of linear
techniques for regression and classification, and then take attendees
through a tour of:
 o  Flexible regression techniques
 o  Classification and regression trees
 o  Neural networks
 o  Projection pursuit regression
 o  Nearest Neighbor methods
 o  Learning vector quantization
 o  Wavelets
 o  Bootstrap and cross-validation
We will also illustrate software tools for implementing the methods.
Our objective is to provide attendees with the background and
knowledge necessary to apply these modern tools to solve their own
real-world problems. The course is geared for:
     o  Statisticians
     o  Financial analysts
     o  Industrial managers 
     o  Medical and Quantitative  researchers
     o  Scientists
     o  others interested in  prediction and  classification
Attendees should have an undergraduate degree in a quantitative
field, or have knowledge and experience working in such a field.
PRICE: $750 per attendee if received by January 15, 1997. Full-time
registered students receive a 40% discount.  Attendance is limited to
the first 60 applicants, so sign up soon!  These courses fill up
quickly.
TO REGISTER: Fill in and return the form appended.
For more details on the course and the instructors:
   o point your web browser to: 
        http://stat.stanford.edu/~trevor/mrc.html
        OR send a request by
   o FAX to Prof. T. Hastie at (415) 326-0854, OR
   o email to trevor@stat.stanford.edu
<----------------------------- Cut Here ------------------------------->
 Please print, and fill in the hard copy to return by mail or FAX
                                REGISTRATION FORM
                    Modern Regression and Classification
             Monday, February 17 and Tuesday, February 18, 1997.
          Hilton Hawaiian Village, Waikiki Beach, Honolulu, Hawaii.
         Name   ___________________________________________________
                Last                 First                   Middle
         Firm or Institution  ______________________________________
        Standard Registration ____         Student Registration ____
         Mailing Address (for receipt)     _________________________
         __________________________________________________________
         __________________________________________________________
         __________________________________________________________
          Country                    Phone                      FAX
         __________________________________________________________
                               email address
       __________________________________________     _______________
       Credit card # (if payment by credit card)      Expiration Date
                  (Lunch preference - tick as appropriate):
         ___ Vegetarian                           ___ Non-Vegetarian
Fee payment can be made by MONEY ORDER , PERSONAL CHECK, or CREDIT CARD
(Mastercard or Visa.) For checks and money orders: all amounts are given in
US dollar figures. Make fee payable to Prof. T. Hastie. Mail it, together
with this completed Registration Form to:
Prof. T. Hastie
538 Campus Drive
Stanford
CA 94305
USA
For payment by credit card, include credit card details above, and mail to
above address, or else FAX form to 415-326-0854
For further information, contact:
Trevor Hastie
Stanford University
Tel. or FAX: 415-326-0854
e-mail: trevor@stat.stanford.edu.
http://stat.stanford.edu/~trevor/mrc.html
REGISTRATION FEE
Standard Registration: U.S. $750 ($950 after Jan 15, 1997)
Student Registration: U.S. $450 ($530 after Jan 15, 1997)
Student registrations - include copy of student ID.
- Cancellation policy: No fee if cancellation before Jan 15, 1997.
- Cancellation fee after January 15 but before Feb 12, 1997: $100. 
- Refund at discretion of organizers if cancellation after Feb 12, 1997.
- Registration fee includes course materials, coffee breaks, and lunches
- On-site Registration is possible if course is not fully booked, at late
fee.

Subject: Re: multicollinearity
From: jim bouldin
Date: Thu, 19 Dec 1996 23:10:42 -0800

Jim Bouldin wrote:
> > So, question one.  Would a solution be an analysis of covariance by
> > turning one of the two continuous ind variables into a categorical one
> > and using it as a covariate?  Within each category the correlation
> > between the two ind vars should be greatly reduced, right?  So I could
> > produce an estimate of the independent effects of each ind variable on
> > the dep variable, for each category.
>
T. Scott Thompson replied:
> No.  If all of the data are close to being on a line, then this is
> true for any subset as well.  In fact because we are now allowed to
> vary the line across groups, the within group collinearity will tend
> to be more severe.  If anything the problem is worse, since you have fewer data
> points within each group, and at least as much collinearity.
Scott, thanks for your responses and clear illustrations.  I understand
your points (I think).  Still, I think you are envisioning a higher
correlation of the ind variables than I am.  Imagine more of a data
cloud, with a general trend at a 45 degree angle away from the origin,
with an r of say 0.5.  If it is clear from a scatter plot that the x
axis can be broken into regions such that the correlation between the
ind variables in each of those regions is significantly less than the
correlation over the full range of the data, I don't see why ANCOVA
wouldn't be a suitable way of estimating the independent effects of the
two ind variables on the dep variable, at several levels of the
categorized ind variable.
> A final point: You say that you realize that you shouldn't make
> forecasts that involve varying a regressor outside its observed range.
> I can't think of any argument supporting this view that doesn't also
> tell you that you shouldn't make forecasts that vary a pair of
> regressors outside the observed range for the pair.  Extrapolation is
> extrapolation, whether the individual variables, considered one at a
> time, have reasonable values or not.
Ok, I see that--thanks.
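
[Archive note] The scenario described in this post (a data cloud with an
overall r of about 0.5, cut into regions along one variable) can be explored
with a small simulation. This is only a sketch: the bivariate Normal data,
the sample size, and the choice of three equal-sized slices are assumptions,
not anything taken from the thread.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
rho, n = 0.5, 3000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]

overall = pearson(x1, x2)

# Cut x1 into three equal-sized slices and recompute within each slice.
order = sorted(range(n), key=lambda i: x1[i])
slices = [order[k * n // 3:(k + 1) * n // 3] for k in range(3)]
within = [pearson([x1[i] for i in idx], [x2[i] for i in idx])
          for idx in slices]
spread = [statistics.pstdev([x1[i] for i in idx]) for idx in slices]

print("overall correlation:      ", round(overall, 3))
print("within-slice correlations:", [round(w, 3) for w in within])
print("within-slice sd of x1:    ", [round(s, 3) for s in spread])
print("overall sd of x1:         ", round(statistics.pstdev(x1), 3))
```

In this setup the within-slice correlations come out lower than the overall
one, but the spread of x1 inside each slice shrinks as well, so a lower
correlation does not by itself imply that the separate effects are estimated
more precisely within slices.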

Subject: Re: Early Stopping and Unblinded Assessment
From: orourke@utstat.toronto.edu (Keith O'Rourke)
Date: Tue, 17 Dec 1996 14:36:41 GMT

Subject: Re: Occam's razor & WDB2T [was Decidability question]
From: Ian 
Date: Thu, 19 Dec 1996 15:47:19 +0000

Patrick Juola wrote:
> 
> In article <32B03222.41C67EA6@sees.bangor.ac.uk> Ian  writes:
> >Ilias Kastanas wrote:
> >>
> >> In article <329D8210.41C67EA6@sees.bangor.ac.uk>,
> >> Ian   wrote:
> >> >I'm intreged (spelling?) as to how the rules of proof were shown to be
> >> >_the_ rules of proof. Surely any such proof would have to be
> >> >self-referential, or rely on axioms.
> >>
> >>         It is in fact remarkable.  The logical axioms and Modus Ponens
> >>    are straightforward and almost simplistic; and yet they suffice.  For
> >>    every semantic implication, "in every structure where P holds, Q also
> >>    holds" there is a formal deduction of Q from P using those rules.  It
> >>    is the Completeness Theorem.
> >>
> >>                                                         Ilias
> >
> >
> >I am confused as to exactly what you mean by the completeness theorem.
> >You don't seem to have said anything here which invalidates my comments
> >on self-referentiality or reliance on axioms.
> 
> Actually, he did; he said that you hadn't done enough reading.
> What you're looking for is Godel's Completeness Theorem.  Basically,
> it demonstrates that, given a set of axioms (as a *variable*, in this
> context), if a sentence is true in all models satisfying the axioms,
> then it's derivable via 1-order logic (or in other words, true in all
> models implies provable).
> 
> The tricky bit (clever chap, Kurt) is that by quantizing over axioms,
> and because most of the work is done by the semantics, he can demonstrate
> that it doesn't matter what axioms you pick.
> 
>         Patrick
Four questions arise from this:
1. Which logical system do you use to prove that a sentence is true in
   all models?
2. Which logical system do you use to prove that if a sentence is true 
   in all models  ... it's derivable by 1-order logic?
3. What exactly do you mean by quantising over axioms?
4. What do you mean by "the work is done by the semantics"?
cheers,
Ian

Subject: Re: How test signif. two numbers
From: aacbrown@aol.com (AaCBrown)
Date: 20 Dec 1996 15:53:42 GMT

dsmith@psy.ucsd.edu (David Smith) writes:
> I get a mean of the squared errors for Zdata to
> Zone (call it A), and another mean of the squared
> errors for Zdata to Ztwo (call it B). B is around
> seven times larger than A. . . . [I]s there a way
> of measuring the statistical significance of the fact
> that B is much larger than A?
The usual approach is to do an F-test. This assumes that the prediction
errors for each model are i.i.d. draws from a Normal distribution with
mean zero and constant variance within each model (that is, the two
models may have different variances, but each variance is the same for all
data points within that model).
The main problem with this is that it fails to account for the fact that
you are evaluating on the same dataset you used to fit the models. If the
dataset is large and the models are simple with few parameters, this is
not a major problem.
It is also probably unreasonable to assume that the prediction errors of
the models are i.i.d. Normal (for example, there may be some error in the
dependent measurements; this would induce a correlation in the errors of
the two models). But my guess is that this will not be a big problem
unless you have outliers or lots of error (i.e. if your prediction models
do not predict well).
Aaron C. Brown
New York, NY
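
[Archive note] A minimal sketch of the variance-ratio (F) comparison
described above. The mean squared errors (A = 1.0, B = 7.0) and the number of
data points (n = 25) are hypothetical, since the original question gives only
the ratio. To stay within the standard library, the tail probability is
approximated by simulation rather than read from an F table; this relies on
the same assumptions as the test itself (i.i.d. zero-mean Normal errors) and
additionally treats the two models' errors as independent.

```python
import random

def simulated_p_value(ratio, n, n_sim=5000, seed=0):
    """Approximate P(F >= ratio) when both error variances are equal.

    With zero-mean Normal errors, each mean squared error over n points is
    proportional to a chi-square with n degrees of freedom, so the ratio of
    two independent ones follows an F(n, n) distribution under the null.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        a = sum(rng.gauss(0, 1) ** 2 for _ in range(n)) / n
        b = sum(rng.gauss(0, 1) ** 2 for _ in range(n)) / n
        if b / a >= ratio:
            hits += 1
    return hits / n_sim

mse_a, mse_b, n = 1.0, 7.0, 25      # hypothetical values
f_stat = mse_b / mse_a              # one-sided: is B larger than A?
p_value = simulated_p_value(f_stat, n)
print("F statistic:", f_stat)
print("approximate one-sided p-value:", p_value)
```

The caveats in the post still apply: errors evaluated on the fitting data,
or errors correlated between the two models, are not covered by this
calculation.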

Subject: Re: Controlling for patients
From: aacbrown@aol.com (AaCBrown)
Date: 20 Dec 1996 16:08:53 GMT

"D.C.Lee" in <32B72E03.289F@cms.cc.wayne.edu>
describes an experimental set-up and asks some questions.
I like to keep things concrete, so let me see if I understand your
question. You measure 20 patients on three physiologic variables for four
hours each. Say these are temperature, blood pressure and respiration
rate. Distributed at random within the four hours are some "events"; say
they are sneezes. You measure the events by a "sneezingness index" that is
near zero most of the time but shoots up to very high values around the
sneeze. You want to analyze the physiologic changes that occur near a
sneeze.
If this is a correct interpretation of your set-up, you are correct that a
multiple regression over the entire 80-hour sample is not a wise approach.
Cutting sample windows around the events makes much more sense. You are
further correct to be worried that you will have a patient effect because
some patients will have several sneezes, others none. There may be
correlations among the variables that depend on the patient.
However, it is usual to test for this after the analysis rather than
before. In other words, do the analysis as if the patients are identical
(or all different, it doesn't matter), then test the residuals for a
patient effect. This will give you a more sensitive and useful test of
whether you must correct for patient.
If you do correct for patient, it will have to be a simple adjustment
given that you have 20 patients and about 80 events. Fortunately, you have
the non-event data stream to use as a baseline. My advice would be to use
all non-event data per patient to fit a model; then measure residuals from
that model around the event.
This is what we do in Finance to study stock returns near an event such
as an earnings announcement. We use the previous non-event period to
estimate the correlation of the stock price with other variables, then we
look at the model's prediction errors around the event.
However I wonder if multiple regression is the appropriate tool. In most
physiologic data I have seen, the interactions are much too complex for an
additive linear model.
Aaron C. Brown
New York, NY
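
[Archive note] The baseline-then-residuals idea from this post (fit a model
to each patient's non-event data, then look at residuals around the event)
can be sketched as below. The sketch makes strong simplifying assumptions:
the baseline "model" is just the mean of the non-event samples, there is a
single physiologic variable, and the temperature series and event time are
invented for illustration.

```python
import statistics

def event_residuals(series, event_times, half_window=2):
    """Return residuals around each event for one patient.

    series: measurements at equally spaced times (list index = time).
    event_times: indices at which an event occurred.
    The baseline is the mean of all samples outside every event window;
    a richer per-patient model could be substituted here.
    """
    def window(t):
        return range(max(0, t - half_window),
                     min(len(series), t + half_window + 1))

    in_window = set()
    for t in event_times:
        in_window.update(window(t))
    baseline = [v for i, v in enumerate(series) if i not in in_window]
    mu = statistics.mean(baseline)
    return {t: [series[i] - mu for i in window(t)] for t in event_times}

# Hypothetical patient: flat temperature with a bump around an event at t = 5.
temps = [37.0, 37.1, 36.9, 37.0, 37.4, 37.9, 37.5, 37.0, 37.1, 36.9]
res = event_residuals(temps, event_times=[5], half_window=1)
print(res)
```

Pooling such residual windows across patients, and then testing them for a
patient effect, would follow the order of steps suggested in the post.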
