kenneth paul collins wrote:
> Geoff Webb wrote:
> > First, my paper addresses the common application of a technique or
> > principle in machine learning, often called Occam's razor, that seeks
> > to minimise the surface syntactic complexity of the inferred classifier
> > in the expectation that doing so will in general increase predictive
> > accuracy. I believe that I have provided strong evidence that this is
> > misguided.
>
> I agree that the approach you refute is a misguided approach.
>
> > Second, I think that this analysis gives reason to rethink a general,
> > often uncritical, acceptance of Occam's razor in a broader context.
>
> This doesn't compute. The "surface syntactic complexity" minimization
> technique is completely locked up in the syntactic rules that are
> arbitrarily chosen. As long as such syntactic rules are in the mix,
> neither those syntactic rules, nor anything which refutes them, is
> generalizable. All such attempts can be disproven by presenting the
> system with something that "breaks" the syntax. (This is also my main
> objection to Goedel's "Incompleteness".)

I would like your comment on my 2 objections to Goedel:

1. He uses a numerically hidden, infinitely repeated substitution.
2. He admits his arithmetic dilemma applies only to logic systems that
   include arithmetic.

Real computer logic is completely finite and does not include the
mathematical induction necessary for Peano arithmetic. So Goedel
incompleteness, Turing, Penrose, and Chaitin 'halting' do not apply to
real computers, only to Turing machines with infinite memory - the
'diagonal' arguments assume constructability of infinitely long numbers!
Please see my easy interactive demo of 'computer cycling' - real
computers can only compute a finite number of cycling (rational) outputs
even if run forever!

RLMassey, Denver CO
e-mail rmassey@orci.com
http://www.csn.net/~pidmass
Negative feedback neural net increases entropy sooner.
I'm reading a couple of statistics papers that have to do with nonparametric estimation in a setting with correlated measurement errors. I find the model used for the errors to be very strange, and I wonder if somebody could help me understand it intuitively.

The setting is this: measurements Y_i are modeled by

  Y_i = m(x_i) + e_i ,

at equally spaced values x_i = i/n, i = 1, 2, ..., n. Asymptotic results are derived as the sampling rate n -> oo. The errors e_i are taken to be samples from a stochastic process Z(t) with autocovariance function g(t). The i-th error is Z(a*x_i), where a is a scaling parameter. Large a corresponds to e_i widely spaced in the stochastic process (covariance g(a*(x_i - x_j))) and therefore not very highly correlated.

Here's what I find peculiar: the parameter a also -> oo, at a rate either faster than n (a/n -> oo), slower than n (a/n -> 0), or the same rate as n (a/n -> constant). The different rates supposedly model different correlation regimes (termed asymptotically independent, long-range correlation, and short-range correlation). These papers are entirely theoretical, with simulated data... no practical examples are given.

My question: when is such a model applicable? In the case a = n, cov(e_i, e_j) = g(i-j) depends on the number of samples between i and j, not the distance between x_i and x_j. In particular, the correlation between the error at x_1 and x_n (approximately x=0 and x=1) depends on how finely you sample. The authors say this is an "ARMA process". I'm sure this is a well-studied stochastic model with plenty of theoretical background. I just want to know: when does it apply?

I find the asymptotically independent and long-range correlation models even more peculiar. I could imagine the above model arising from errors in the measurement process, where some sort of error introduced by the measurement device also influences successive errors. But what physical model could give rise to successive errors which are approximately independent when n is large and highly correlated when n is small? Or, equally strange, more highly correlated when n is large than when it is small? I'd be grateful for any plausible explanation.

The papers are Hall, Lahiri and Polzehl (Annals of Statistics, 1995), and Chu and Marron (Annals of Statistics, 1991).

- Randy Poe
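The role of the scaling parameter can be made concrete numerically. Assuming, purely for illustration, an exponential autocovariance g(t) = exp(-|t|) (an Ornstein-Uhlenbeck-type choice, not taken from the papers), the correlation between adjacent errors is g(a/n), so the three rate regimes behave very differently:

```python
import numpy as np

def error_cov(n, a):
    # Covariance matrix of errors e_i = Z(a * x_i) at x_i = i/n, assuming
    # the illustrative autocovariance g(t) = exp(-|t|).
    x = np.arange(1, n + 1) / n
    return np.exp(-np.abs(a * (x[:, None] - x[None, :])))

n = 50
for a_over_n in (0.01, 1.0, 100.0):
    C = error_cov(n, a_over_n * n)
    # adjacent errors are a/n apart in the Z process, so corr = exp(-a/n)
    print(f"a/n = {a_over_n:6.2f}  corr(e_i, e_i+1) = {C[0, 1]:.3f}")
```

With a/n -> 0 adjacent errors become perfectly correlated (long-range regime), with a/n -> oo they decorrelate (asymptotic independence), and a/n constant pins the adjacent correlation at a fixed value such as exp(-1), depending on i-j only, matching the a = n case described above.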
I am currently designing a mixture experiment to optimize an injection molding compound. There are only three variables, but one of them is highly restricted relative to the other two. For example, two of the components range between 20 and 90 percent of the mixture, but the third (which is known to have a large effect in small concentrations) should only range between 2 and 3 percent.

Analysis of this design shows VIF's of > 50,000!!! It seems the more restricted the range of the third component, the higher the VIF's. After reviewing sources such as Montgomery, VIF's > 100 are not considered good due to multicollinearity. I set the experiment up in terms of pseudocomponents and it helped some, but VIF's are still > 7000. Should I really be all that worried about multicollinearity? I am expecting to see a large difference in the response as the third component goes from 2 to 3 percent of the mix.

Thanks in advance for any help you may be able to give me.

Richard Felton
Hi all, can anybody point me to a public domain tool for regression tree analysis? We need it for analysing linguistic, non-symbolic data.

Thanks a lot,
Maria Wolters
--
=======================================================================
Maria Wolters
Institute for Communications Research and Phonetics
University of Bonn
e-mail: mwo@asl1.ikp.uni-bonn.de
=======================================================================
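[A present-day footnote: a free route here, though BSD-licensed rather than strictly public domain, is scikit-learn's DecisionTreeRegressor. A minimal sketch on made-up toy data, not the linguistic data from the post:]

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# toy data: a step function that a depth-1 regression tree can recover exactly
X = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
y = np.where(X.ravel() < 0.5, 1.0, 3.0)

tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
pred = tree.predict([[0.1], [0.9]])  # one query on each side of the split
print(pred)
```

The fitted tree finds the split at x = 0.5 and predicts the mean of each leaf, so the two queries return the two plateau values.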
In article <19961122083200.DAA27635@ladder01.news.aol.com>, feltonraf@aol.com wrote:

>I am currently designing a mixture experiment to optimize an injection
>molding compound. There are only three variables, but one of them is
>highly restricted relative to the other two. For example, two of the
>components range between 20 and 90 percent of the mixture, but the third
>(which is known to have a large effect in small concentrations) should
>only range between 2 and 3 percent.

It's not possible for two components to vary over the range 20%-90% while the third component varies from 2% to 3%. If I take the lower bounds as necessary, the four vertices of the region are:

  78%  20%  2%
  20%  78%  2%
  77%  20%  3%
  20%  77%  3%

If I take the upper bounds as a requirement, I get:

   7%  90%  3%
  90%   7%  3%
   8%  90%  2%
  90%   8%  2%

>Analysis of this design shows VIF's of > 50,000!!! It seems the more
>restricted the range of the third component, the higher the VIF's. After
>reviewing sources such as Montgomery, VIF's > 100 are not considered good
>due to multicollinearity. I set the experiment up in terms of
>pseudocomponents and it helped some, but VIF's are still > 7000.

The point of using the mixture components model is the interpretability of the variables, their bounds, and their coefficients. If you just want to make predictions, a reparameterized model is just as good. I suggest using a model with a constant term and the two variables W1 = log(X1/X2) and W2 = X3 - 2.5. (You may scale them to the range -1 to 1 if you wish; that way the regression coefficients will represent change over half the range in the experiment.) I have not looked at a diagram, but I'll guess that your region is a parallelogram; hence W1 and W2 will not be orthogonal. I'll also guess that the amount of multicollinearity due to the parallelogram region is small: most of your multicollinearity is due to the mixture variable parameterization.

>Should I really be all that worried about multicollinearity?

I don't know. There can be numerical problems with very ill-conditioned data, but I don't have experience with that. The statistical problem with multicollinearity is that the regression coefficients are estimated very poorly---they have a large variance and are said to be unstable. This instability means that small (with respect to the experimental error) changes in the data (Y, the dependent variable) can have very large effects on the estimated coefficients. Note that you can test this with simulated data.

Perhaps I overgeneralized when I wrote (on sci.stat.edu): No one should ever do an experiment without analyzing it first. So let me rephrase my experience: I have found it extremely useful to analyze intended designs using simulated data before the experiment is done. Obviously you have done something like this (to get the VIF's). Good work! Now see if you will have the coefficient instability problem.
--
Ronald Crosier   E-mail:
Disclaimer: My opinions are just that---mine, and opinions.
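The "analyze the design before you run it" advice can be sketched without any Monte Carlo at all: for ordinary least squares the theoretical standard errors of the coefficients are sigma * sqrt(diag((X'X)^-1)), so candidate design matrices can be compared before collecting data. A hypothetical illustration with toy design matrices (not the poster's actual mixture design), one well-conditioned and one nearly collinear:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
u = rng.uniform(-1, 1, n)
# well-conditioned design: intercept plus two unrelated columns
X_good = np.column_stack([np.ones(n), u, rng.uniform(-1, 1, n)])
# ill-conditioned design: second and third columns nearly identical
X_bad = np.column_stack([np.ones(n), u, u + 0.01 * rng.uniform(-1, 1, n)])

def coef_sd(X, sigma=1.0):
    # theoretical OLS coefficient standard errors: sigma*sqrt(diag((X'X)^-1))
    return sigma * np.sqrt(np.diag(np.linalg.inv(X.T @ X)))

print("good design:", coef_sd(X_good))
print("bad  design:", coef_sd(X_bad))
```

The collinear design inflates the standard errors of the two entangled coefficients by orders of magnitude, which is exactly the "coefficient instability" to check for before running the experiment.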
The best supply of technical and science books is now available at Cody's Books, one of the largest independent bookstores in the world, based in Berkeley, CA. Cody's has the most complete section of computer books, programming books, books on the life sciences, engineering texts, and more. If you need a book on SQL, chi-squared statistics, spacetime physics, or cold fusion - then come to Cody's. Please feel free to mail order any book at: http://www.codysbooks.com/

Unlike most on-line book shops, Cody's Books is a real store with real knowledgeable people. If you have any book questions, you can call us or e-mail and we'll get back to you right away. Of course, you will always find the best prices with the best service at Cody's. So be sure to check us out.

- Ed
http://www.codysbooks.com/

p.s. We wish you the utmost reading pleasure.
Marks Nester writes:

> The data tell you something whether or not you reject
> the null hypothesis, e.g. the data may tell you that the
> two groups are quite similar in their attitudes.

If your null hypothesis is that two groups are identical, and you fail to reject it, there are two possibilities. The groups may actually be identical, or you may not have enough data to distinguish them. Only the second explanation allows you to ask for more money.

Aaron C. Brown
New York, NY
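The second possibility - "not enough data" - is a power question, and it is easy to see by simulation. A hedged sketch with arbitrary numbers (effect size 0.2 standard deviations, a Welch-type statistic with a 1.96 cutoff for simplicity):

```python
import numpy as np

def power(n, diff, reps=400, rng=np.random.default_rng(0)):
    # fraction of simulated two-group studies that reject "groups identical"
    hits = 0
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, n)    # group 1
        y = rng.normal(diff, 1.0, n)   # group 2: true mean difference = diff
        t = (y.mean() - x.mean()) / np.sqrt(
            x.var(ddof=1) / n + y.var(ddof=1) / n)
        hits += abs(t) > 1.96
    return hits / reps

print(power(20, 0.2))    # small study: usually fails to reject
print(power(1000, 0.2))  # large study: almost always rejects
```

Both simulations have genuinely different groups; only the larger one can reliably tell, which is exactly why failing to reject does not establish that the groups are identical.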
feltonraf@aol.com in <19961122083200.DAA27635@ladder01.news.aol.com> writes:

> I am currently designing a mixture experiment to optimize
> an injection molding compound. There are only three
> variables, but one of them is highly restricted relative to the
> other two. For example, two of the components range
> between 20 and 90 percent of the mixture, but the third
> (which is known to have a large effect in small concentrations)
> should only range between 2 and 3 percent. Analysis of this
> design shows VIF's of > 50,000!!!

If there are only three components then they will always add to 100%. This will cause huge VIF's. The solution is to drop one variable from the analysis, since it can be completely determined by the other two. If there are other components of varying concentration you can still have a highly correlated data set. You could either drop a variable or transform your data to make things more independent.

Aaron C. Brown
New York, NY
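The exact dependence is easy to exhibit. With hypothetical design points where the three components sum to 100% (the numbers below are made up for illustration, respecting the poster's stated ranges), the model matrix with an intercept is rank-deficient, and dropping one component restores full column rank:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# hypothetical design points: x3 narrow (2-3%), x1 in range, x2 = remainder
x3 = rng.uniform(2, 3, n)
x1 = rng.uniform(20, 77, n)
x2 = 100 - x1 - x3            # the components always sum to 100%

X_full = np.column_stack([np.ones(n), x1, x2, x3])  # intercept + all three
X_drop = np.column_stack([np.ones(n), x1, x2])      # one component dropped

print(np.linalg.matrix_rank(X_full))  # 4 columns, but rank only 3
print(np.linalg.matrix_rank(X_drop))  # 3 columns, full rank
```

Because x1 + x2 + x3 equals 100 times the intercept column, any one component is an exact linear combination of the intercept and the other two, which is the source of the enormous VIF's.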
In article <329393FD.10CE@orci.com>, Bob Massey wrote:

>kenneth paul collins wrote:
>> ..... All such attempts can be disproven by presenting the
>> system with something that "breaks" the syntax. (This is also my main
>> objection to Goedel's "Incompleteness".)

Side note: I didn't follow this exchange, but "breaking the syntax" is irrelevant to Goedel incompleteness.

>I would like your comment on my 2 objections to Goedel:
>1. He uses a numerically hidden infinitely repeated substitution.

He uses a specified finite procedure. What do you mean by such a "substitution", and where does it appear?

>2. He admits his arithmetic dilemma applies only to logic systems
>that include arithmetic. Real computer logic is completely finite and
>does not include the mathematical induction necessary for Peano
>arithmetic. So Goedel incompleteness, Turing, Penrose, Chaitin 'halting'
>do not apply to real computers, only to Turing machines with infinite
>memory - the 'diagonal' arguments assume constructability of infinitely
>long numbers!

They do not. Their "numbers" are the standard, familiar integers. Where in Goedel's or Turing's papers did you see infinitely long ones? There _is_ recursion theory for infinite sequences of integers and the like, but it is not involved here.

>Arne D Halvorsen wrote:
>>
>> First a few generally accepted results:
>>
>> - Church-Turing-Post showed that no algorithm can in general decide
>>   whether other programs will halt or keep going forever.
>
>Their arguments do not apply to real computers with finite memory - not
>well known.

For finite-memory computers the non-existence of such an algorithm is easily seen, and does not _need_ the arguments of Church, Turing, etc.

>Proof: a deterministic automaton with finite memory has only a finite
>number of possible total internal states. So after a fixed starting
>input, it must repeat one of these total internal states after it has
>run long enough - possibly billions of years. Its determinism or
>consistency requires it to repeat over and over the results after this
>repetition. Computers can compute only a finite number of different
>rational numbers or repeating integers; not even square root of 2, even

A well-known result, part of every Formal Languages course. The halting behavior of finite automata is simple. But it does not mean that _a finite automaton_ can determine the halting behavior of all finite automata. For most of them it cannot even get started; they are bigger than it; they don't fit even just to be given as input. You might restrict to FAs of size <= some k, but that does not help.

>if run for ever. Theoretically, one only needs a larger computer, and a
>lot of time, to find the repetition point for any computer & program.

Aha. Solving the halting problem for all computers in existence requires a computer larger than any computer in existence. That's what I said too, above...

To put it another way, it is enough to have an "extensible memory" computer: a computer to which you can always add another disk whenever it beeps for one. Such computers have a name. They are called Turing machines. No, a TM does not need an "infinite" tape; only one that can be extended at will. Yes, a TM solves Finite-Automata-Halting.

>The halting problem applies only to infinite memory, supernatural
>computers - Turing, Chaitin, Penrose are theological, not logical.

Infinite-memory machines, IMs, had also better have "infinite time"... the ability to carry out an infinity of steps here and there; otherwise they cannot access infinitely much memory. They wouldn't really be IMs in that case. Sure, IMs are supernatural (or "aleph-0 minds", as C. Spector called the countable ones). IMs easily solve the Turing Machine Halting Problem. But they don't solve their _own_ halting. Hmm... eerily familiar. Maybe theology has its points; if Hilbert used it to solve the invariants problem we'll use it too on the 'ghost in the machine'.
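[The "repetition point" of a deterministic finite-state system really can be found mechanically: it is just cycle detection on an iterated function. A sketch using Floyd's tortoise-and-hare algorithm, with an arbitrary 8-bit update rule standing in for a machine's state-transition function:]

```python
def find_cycle(f, x0):
    # Floyd's tortoise-and-hare: returns (mu, lam), where mu is the number
    # of steps before the state sequence starts repeating and lam is the
    # period of the repetition.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    mu, tortoise = 0, x0
    while tortoise != hare:
        tortoise, hare, mu = f(tortoise), f(hare), mu + 1
    lam, hare = 1, f(tortoise)
    while tortoise != hare:
        hare, lam = f(hare), lam + 1
    return mu, lam

step = lambda s: (s * s + 1) % 256  # toy 8-bit "total internal state" update
print(find_cycle(step, 3))
```

This needs only a handful of state variables, but note the catch discussed above: the detector must be able to hold two copies of the target machine's state, so a machine of size k cannot run this on all machines of size k.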
Hilbert, by the way, pointed out the usefulness of "ideal" objects... And R, the reals, physically impossible, hold a record amount of physical applications.

>> - Turing showed that there are automata that can self-reproduce.
>>
>> - There is also a result by Cohen (if I'm not mistaken), which may be
>>   wrong and was rejected by some, that finding out whether an entity is
>>   self-reproducing is as hard as the Halting Problem
>
>It was a big problem in the dark ages to compute the number of angels on
>a pin.

Get that wrong and you'll have some angels with bruised, sore behinds to deal with. But there must be an angel on some pin of my CPU; I have self-reproducing programs. Compile the C source code file rep.c, put rep.c out of reach, run the executable, and its output is an exact copy of rep.c... character by character. Actually, _lots_ of people have such programs. In LISP, C++, PASCAL, assembly... Just like Turing and Kleene said.

Ilias
>Unfortunately, no. The correlation function is based on linear association
>only; chaotic time series will have non-linear dependence.
>
>In principle you could get around this by using a more sophisticated
>measure of association. But another feature of chaotic systems is that
>even a small amount of random noise will have a large effect on your data.
>Therefore you are unlikely to get reliable results from this approach.
>
>In some cases the dependence is so complicated, or the system so unstable,
>that the series must just be treated as random. However, many chaotic
>systems have representations that are reasonably simple and not sensitive
>to random noise. If you can find one and figure it out, then you can make
>better predictions.
>
>Aaron C. Brown
>New York, NY

Has there been any use of chaos theory in the social (statistical) sciences? I hear people speak who jump on the 'chaos theory is everything' bandwagon, and I wonder if there have been any real statistical applications. Would anyone care to point me towards some references, if they exist? (Note: I'm not a mathematician.)

=======================================
Joel Winter -- s994299@umslvma.umsl.edu
University of Missouri - St. Louis
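The point that linear correlation misses chaotic structure is easy to demonstrate with the standard textbook example, the logistic map x_{k+1} = 4 x_k (1 - x_k): the series is fully deterministic, yet its lag-1 sample autocorrelation is essentially zero.

```python
import numpy as np

# iterate the chaotic logistic map x_{k+1} = 4 x (1 - x)
x = np.empty(10000)
x[0] = 0.1234
for k in range(1, len(x)):
    x[k] = 4.0 * x[k - 1] * (1.0 - x[k - 1])

lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.3f}")  # near zero despite determinism
```

A linear diagnostic would call this series white noise, yet plotting x[k+1] against x[k] puts every point exactly on the parabola 4x(1-x) - a simple representation of the kind the quoted reply describes.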
I'm putting together the web site for the National Center for Environmental Statistics. We are planning to keep a listing of home pages for environmental statisticians. The center's home page URL is:

  http://www.stat.washington.edu/NCES

The listing can be found in the resources link, or you can go there directly:

  http://www.stat.washington.edu/NCES/environ_people.shtml

If anyone is interested in adding a link to their web page, please send me the address of the web page and how you would like your name to appear. The subject line should read: Env stat list

Thanks,
Peter Sutherland
cactus@stat.washington.edu