kenneth paul collins wrote:
> Geoff Webb wrote:
> > First, my paper addresses the common application of a technique or
> > principle in machine learning, often called Occam's razor, that seeks
> > to minimise the surface syntactic complexity of the inferred classifier
> > in the expectation that doing so will in general increase predictive
> > accuracy. I believe that I have provided strong evidence that this is
> > misguided.
>
> I agree that the approach you refute is a misguided approach.
>
> > Second, I think that this analysis gives reason to rethink a general,
> > often uncritical, acceptance of Occam's razor in a broader context.
>
> This doesn't compute. The "surface syntactic complexity" minimization
> technique is completely locked up in the syntactic rules that are
> arbitrarily chosen. As long as such syntactic rules are in the mix,
> neither those syntactic rules, nor anything which refutes them, is
> generalizable. All such attempts can be disproven by presenting the
> system with something that "breaks" the syntax. (This is also my main
> objection to Goedel's "Incompleteness".)

I would like your comment on my 2 objections to Goedel:

1. He uses a numerically hidden, infinitely repeated substitution.
2. He admits his arithmetic dilemma applies only to logic systems that
   include arithmetic.

Real computer logic is completely finite and does not include the
mathematical induction necessary for Peano arithmetic. So Goedel
incompleteness, Turing, Penrose, and Chaitin 'halting' do not apply to
real computers, only to Turing machines with infinite memory - the
'diagonal' arguments assume constructability of infinitely long numbers!
Please see my easy interactive demo of 'computer cycling' - real
computers can only compute a finite number of cycling (rational) outputs
even if run forever!

RLMassey, Denver CO
e-mail rmassey@orci.com
http://www.csn.net/~pidmass
Negative feedback neural net increases entropy sooner.
I'm reading a couple of statistics papers that have to do with nonparametric estimation in a setting with correlated measurement errors. I find the model used for the errors to be very strange, and I wonder if somebody could help me understand it intuitively.

The setting is this: measurements Y_i are modeled by

  Y_i = m(x_i) + e_i ,

at equally spaced values x_i = i/n, i = 1, 2, ..., n. Asymptotic results are derived as the sampling rate n -> oo. The errors e_i are taken to be samples from a stochastic process Z(t) with autocovariance function g(t). The i-th error is Z(a*x_i), where a is a scaling parameter. Large a corresponds to e_i widely spaced in the stochastic process (covariance g(a*(x_i - x_j))) and therefore not very highly correlated.

Here's what I find peculiar: the parameter a also -> oo, at a rate either faster than n (a/n -> oo), slower than n (a/n -> 0), or the same rate as n (a/n -> constant). The different rates supposedly model different correlation regimes (termed asymptotically independent, long-range correlation, and short-range correlation). These papers are entirely theoretical, with simulated data... no practical examples are given.

My question: when is such a model applicable? In the case a = n, cov(e_i, e_j) = g(i-j) depends on the number of samples between i and j, not the distance between x_i and x_j. In particular, the correlation between the error at x_1 and x_n (approximately x=0 and x=1) depends on how finely you sample. The authors say this is an "ARMA process". I'm sure this is a well-studied stochastic model with plenty of theoretical background. I just want to know: when does it apply?

I find the asymptotically independent and long-range correlation models even more peculiar. I could imagine the above model arising from errors in the measurement process, where some sort of error introduced by the measurement device also influences successive errors. But what physical model could give rise to successive errors which are approximately independent when n is large and highly correlated when n is small? Or, equally strange, more highly correlated when n is large than when it is small? I'd be grateful for any plausible explanation.

The papers are Hall, Lahiri and Polzehl (Annals of Statistics, 1995), and Chu and Marron (Annals of Statistics, 1991).

- Randy Poe
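The role of the scaling parameter can be made concrete numerically. Assuming, purely for illustration, an exponential autocovariance g(t) = exp(-|t|) (an Ornstein-Uhlenbeck-type choice, not taken from the papers), the correlation between adjacent errors is g(a/n), so the three rate regimes behave very differently:

```python
import numpy as np

def error_cov(n, a):
    # Covariance matrix of errors e_i = Z(a * x_i) at x_i = i/n, assuming
    # the illustrative autocovariance g(t) = exp(-|t|).
    x = np.arange(1, n + 1) / n
    return np.exp(-np.abs(a * (x[:, None] - x[None, :])))

n = 50
for a_over_n in (0.01, 1.0, 100.0):
    C = error_cov(n, a_over_n * n)
    # adjacent errors are a/n apart in the Z process, so corr = exp(-a/n)
    print(f"a/n = {a_over_n:6.2f}  corr(e_i, e_i+1) = {C[0, 1]:.3f}")
```

With a/n -> 0 adjacent errors become perfectly correlated (long-range regime), with a/n -> oo they decorrelate (asymptotic independence), and a/n constant pins the adjacent correlation at a fixed value such as exp(-1), depending on i-j only, matching the a = n case described above.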
I am currently designing a mixture experiment to optimize an injection molding compound. There are only three variables, but one of them is highly restricted relative to the other two. For example, two of the components range between 20 and 90 percent of the mixture, but the third (which is known to have a large effect in small concentrations) should only range between 2 and 3 percent.

Analysis of this design shows VIF's of > 50,000!!! It seems the more restricted the range of the third component, the higher the VIF's. After reviewing sources such as Montgomery, VIF's > 100 are not considered good due to multicollinearity. I set the experiment up in terms of pseudocomponents and it helped some, but VIF's are still > 7000. Should I really be all that worried about multicollinearity? I am expecting to see a large difference in the response as the third component goes from 2 to 3 percent of the mix.

Thanks in advance for any help you may be able to give me.

Richard Felton
Hi all, can anybody point me to a public domain tool for regression tree analysis? We need it for analysing linguistic, non-symbolic data.

Thanks a lot,
Maria Wolters
--
=======================================================================
Maria Wolters
Institute for Communications Research and Phonetics
University of Bonn
e-mail: mwo@asl1.ikp.uni-bonn.de
=======================================================================
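[A present-day footnote: a free route here, though BSD-licensed rather than strictly public domain, is scikit-learn's DecisionTreeRegressor. A minimal sketch on made-up toy data, not the linguistic data from the post:]

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# toy data: a step function that a depth-1 regression tree can recover exactly
X = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
y = np.where(X.ravel() < 0.5, 1.0, 3.0)

tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
pred = tree.predict([[0.1], [0.9]])  # one query on each side of the split
print(pred)
```

The fitted tree finds the split at x = 0.5 and predicts the mean of each leaf, so the two queries return the two plateau values.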
In article <19961122083200.DAA27635@ladder01.news.aol.com>, feltonraf@aol.com wrote:

>I am currently designing a mixture experiment to optimize an injection
>molding compound. There are only three variables, but one of them is
>highly restricted relative to the other two. For example, two of the
>components range between 20 and 90 percent of the mixture, but the third
>(which is known to have a large effect in small concentrations) should
>only range between 2 and 3 percent.

It's not possible for two components to vary over the range 20%-90% while the third component varies from 2% to 3%. If I take the lower bounds as necessary, the four vertices of the region are:

  78%  20%  2%
  20%  78%  2%
  77%  20%  3%
  20%  77%  3%

If I take the upper bounds as a requirement, I get:

   7%  90%  3%
  90%   7%  3%
   8%  90%  2%
  90%   8%  2%

>Analysis of this design shows VIF's of > 50,000!!! It seems the more
>restricted the range of the third component, the higher the VIF's. After
>reviewing sources such as Montgomery, VIF's > 100 are not considered good
>due to multicollinearity. I set the experiment up in terms of
>pseudocomponents and it helped some, but VIF's are still > 7000.

The point of using the mixture components model is the interpretability of the variables, their bounds, and their coefficients. If you just want to make predictions, a reparameterized model is just as good. I suggest using a model with a constant term and the two variables W1 = log(X1/X2) and W2 = X3 - 2.5. (You may scale them to the range -1 to 1 if you wish; that way the regression coefficients will represent change over half the range in the experiment.) I have not looked at a diagram, but I'll guess that your region is a parallelogram; hence W1 and W2 will not be orthogonal. I'll also guess that the amount of multicollinearity due to the parallelogram region is small: most of your multicollinearity is due to the mixture variable parameterization.

>Should I really be all that worried about multicollinearity?

I don't know. There can be numerical problems with very ill-conditioned data, but I don't have experience with that. The statistical problem with multicollinearity is that the regression coefficients are estimated very poorly---they have a large variance and are said to be unstable. This instability means that small (with respect to the experimental error) changes in the data (Y, the dependent variable) can have very large effects on the estimated coefficients. Note that you can test this with simulated data.

Perhaps I overgeneralized when I wrote (on sci.stat.edu): No one should ever do an experiment without analyzing it first. So let me rephrase my experience: I have found it extremely useful to analyze intended designs using simulated data before the experiment is done. Obviously you have done something like this (to get the VIF's). Good work! Now see if you will have the coefficient instability problem.
--
Ronald Crosier   E-mail:
Disclaimer: My opinions are just that---mine, and opinions.
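The "analyze the design before you run it" advice can be sketched without any Monte Carlo at all: for ordinary least squares the theoretical standard errors of the coefficients are sigma * sqrt(diag((X'X)^-1)), so candidate design matrices can be compared before collecting data. A hypothetical illustration with toy design matrices (not the poster's actual mixture design), one well-conditioned and one nearly collinear:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
u = rng.uniform(-1, 1, n)
# well-conditioned design: intercept plus two unrelated columns
X_good = np.column_stack([np.ones(n), u, rng.uniform(-1, 1, n)])
# ill-conditioned design: second and third columns nearly identical
X_bad = np.column_stack([np.ones(n), u, u + 0.01 * rng.uniform(-1, 1, n)])

def coef_sd(X, sigma=1.0):
    # theoretical OLS coefficient standard errors: sigma*sqrt(diag((X'X)^-1))
    return sigma * np.sqrt(np.diag(np.linalg.inv(X.T @ X)))

print("good design:", coef_sd(X_good))
print("bad  design:", coef_sd(X_bad))
```

The collinear design inflates the standard errors of the two entangled coefficients by orders of magnitude, which is exactly the "coefficient instability" to check for before running the experiment.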
The best supply of technical and science books is now available at Cody's Books, one of the largest independent bookstores in the world, based in Berkeley, CA. Cody's has the most complete section of computer books, programming books, books on the life sciences, engineering texts, and more. If you need a book on SQL, chi-squared statistics, spacetime physics, or cold fusion - then come to Cody's. Please feel free to mail order any book at: http://www.codysbooks.com/

Unlike most on-line book shops, Cody's Books is a real store with real knowledgeable people. If you have any book questions, you can call us or e-mail and we'll get back to you right away. Of course, you will always find the best prices with the best service at Cody's. So be sure to check us out.

- Ed
http://www.codysbooks.com/

p.s. We wish you the utmost reading pleasure.
Marks Nester writes:

> The data tell you something whether or not you reject
> the null hypothesis, e.g. the data may tell you that the
> two groups are quite similar in their attitudes.

If your null hypothesis is that two groups are identical, and you fail to reject it, there are two possibilities. The groups may actually be identical, or you may not have enough data to distinguish them. Only the second explanation allows you to ask for more money.

Aaron C. Brown
New York, NY
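The second possibility - "not enough data" - is a power question, and it is easy to see by simulation. A hedged sketch with arbitrary numbers (effect size 0.2 standard deviations, a Welch-type statistic with a 1.96 cutoff for simplicity):

```python
import numpy as np

def power(n, diff, reps=400, rng=np.random.default_rng(0)):
    # fraction of simulated two-group studies that reject "groups identical"
    hits = 0
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, n)    # group 1
        y = rng.normal(diff, 1.0, n)   # group 2: true mean difference = diff
        t = (y.mean() - x.mean()) / np.sqrt(
            x.var(ddof=1) / n + y.var(ddof=1) / n)
        hits += abs(t) > 1.96
    return hits / reps

print(power(20, 0.2))    # small study: usually fails to reject
print(power(1000, 0.2))  # large study: almost always rejects
```

Both simulations have genuinely different groups; only the larger one can reliably tell, which is exactly why failing to reject does not establish that the groups are identical.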
feltonraf@aol.com in <19961122083200.DAA27635@ladder01.news.aol.com> writes:

> I am currently designing a mixture experiment to optimize
> an injection molding compound. There are only three
> variables, but one of them is highly restricted relative to the
> other two. For example, two of the components range
> between 20 and 90 percent of the mixture, but the third
> (which is known to have a large effect in small concentrations)
> should only range between 2 and 3 percent. Analysis of this
> design shows VIF's of > 50,000!!!

If there are only three components then they will always add to 100%. This will cause huge VIF's. The solution is to drop one variable from the analysis, since it can be completely determined by the other two. If there are other components of varying concentration you can still have a highly correlated data set. You could either drop a variable or transform your data to make things more independent.

Aaron C. Brown
New York, NY
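The exact dependence is easy to exhibit. With hypothetical design points where the three components sum to 100% (the numbers below are made up for illustration, respecting the poster's stated ranges), the model matrix with an intercept is rank-deficient, and dropping one component restores full column rank:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# hypothetical design points: x3 narrow (2-3%), x1 in range, x2 = remainder
x3 = rng.uniform(2, 3, n)
x1 = rng.uniform(20, 77, n)
x2 = 100 - x1 - x3            # the components always sum to 100%

X_full = np.column_stack([np.ones(n), x1, x2, x3])  # intercept + all three
X_drop = np.column_stack([np.ones(n), x1, x2])      # one component dropped

print(np.linalg.matrix_rank(X_full))  # 4 columns, but rank only 3
print(np.linalg.matrix_rank(X_drop))  # 3 columns, full rank
```

Because x1 + x2 + x3 equals 100 times the intercept column, any one component is an exact linear combination of the intercept and the other two, which is the source of the enormous VIF's.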
In article <329393FD.10CE@orci.com>, Bob Massey wrote:

>kenneth paul collins wrote:
>> ..... All such attempts can be disproven by presenting the
>> system with something that "breaks" the syntax. (This is also my main
>> objection to Goedel's "Incompleteness".)

Side note: I didn't follow this exchange, but "breaking the syntax" is irrelevant to Goedel incompleteness.

>I would like your comment on my 2 objections to Goedel:
>1. He uses a numerically hidden infinitely repeated substitution.

He uses a specified finite procedure. What do you mean by such a "substitution", and where does it appear?

>2. He admits his arithmetic dilemma applies only to logic systems
>that include arithmetic. Real computer logic is completely finite and
>does not include the mathematical induction necessary for Peano
>arithmetic. So Goedel incompleteness, Turing, Penrose, Chaitin 'halting'
>do not apply to real computers, only to Turing machines with infinite
>memory - the 'diagonal' arguments assume constructability of infinitely
>long numbers!

They do not. Their "numbers" are the standard, familiar integers. Where in Goedel's or Turing's papers did you see infinitely long ones? There _is_ recursion theory for infinite sequences of integers and the like, but it is not involved here.

>Arne D Halvorsen wrote:
>>
>> First a few generally accepted results:
>>
>> - Church-Turing-Post showed that no algorithm can in general decide
>>   whether other programs will halt or keep going forever.
>
>Their arguments do not apply to real computers with finite memory - not
>well known.

For finite-memory computers the non-existence of such an algorithm is easily seen, and does not _need_ the arguments of Church, Turing, etc.

>Proof: a deterministic automaton with finite memory has only a finite
>number of possible total internal states. So after a fixed starting
>input, it must repeat one of these total internal states after it has
>run long enough - possibly billions of years. Its determinism or
>consistency requires it to repeat over and over the results after this
>repetition. Computers can compute only a finite number of different
>rational numbers or repeating integers; not even square root of 2, even

A well-known result, part of every Formal Languages course. The halting behavior of finite automata is simple. But it does not mean that _a finite automaton_ can determine the halting behavior of all finite automata. For most of them it cannot even get started; they are bigger than it; they don't fit even just to be given as input. You might restrict to FAs of size <= some k, but that does not help.

>if run for ever. Theoretically, one only needs a larger computer, and a
>lot of time, to find the repetition point for any computer & program.

Aha. Solving the halting problem for all computers in existence requires a computer larger than any computer in existence. That's what I said too, above...

To put it another way, it is enough to have an "extensible memory" computer: a computer to which you can always add another disk whenever it beeps for one. Such computers have a name. They are called Turing machines. No, a TM does not need an "infinite" tape; only one that can be extended at will. Yes, a TM solves Finite-Automata-Halting.

>The halting problem applies only to infinite memory, supernatural
>computers - Turing, Chaitin, Penrose are theological, not logical.

Infinite-memory machines, IMs, had also better have "infinite time"... the ability to carry out an infinity of steps here and there; otherwise they cannot access infinitely much memory. They wouldn't really be IMs in that case. Sure, IMs are supernatural (or "aleph-0 minds", as C. Spector called the countable ones). IMs easily solve the Turing Machine Halting Problem. But they don't solve their _own_ halting. Hmm... eerily familiar. Maybe theology has its points; if Hilbert used it to solve the invariants problem we'll use it too on the 'ghost in the machine'.
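[The "repetition point" of a deterministic finite-state system really can be found mechanically: it is just cycle detection on an iterated function. A sketch using Floyd's tortoise-and-hare algorithm, with an arbitrary 8-bit update rule standing in for a machine's state-transition function:]

```python
def find_cycle(f, x0):
    # Floyd's tortoise-and-hare: returns (mu, lam), where mu is the number
    # of steps before the state sequence starts repeating and lam is the
    # period of the repetition.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    mu, tortoise = 0, x0
    while tortoise != hare:
        tortoise, hare, mu = f(tortoise), f(hare), mu + 1
    lam, hare = 1, f(tortoise)
    while tortoise != hare:
        hare, lam = f(hare), lam + 1
    return mu, lam

step = lambda s: (s * s + 1) % 256  # toy 8-bit "total internal state" update
print(find_cycle(step, 3))
```

This needs only a handful of state variables, but note the catch discussed above: the detector must be able to hold two copies of the target machine's state, so a machine of size k cannot run this on all machines of size k.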
Hilbert, by the way, pointed out the usefulness of "ideal" objects... And R, the reals, physically impossible, hold a record amount of physical applications.

>> - Turing showed that there are automata that can self-reproduce.
>>
>> - There is also a result by Cohen (if I'm not mistaken), which may be
>>   wrong and was rejected by some, that finding out whether an entity is
>>   self-reproducing is as hard as the Halting Problem
>
>It was a big problem in the dark ages to compute the number of angels on
>a pin.

Get that wrong and you'll have some angels with bruised, sore behinds to deal with. But there must be an angel on some pin of my CPU; I have self-reproducing programs. Compile the C source code file rep.c, put rep.c out of reach, run the executable, and its output is an exact copy of rep.c... character by character. Actually, _lots_ of people have such programs. In LISP, C++, PASCAL, assembly... Just like Turing and Kleene said.

Ilias
>Unfortunately, no. The correlation function is based on linear association
>only; chaotic time series will have non-linear dependence.
>
>In principle you could get around this by using a more sophisticated
>measure of association. But another feature of chaotic systems is that
>even a small amount of random noise will have a large effect on your data.
>Therefore you are unlikely to get reliable results from this approach.
>
>In some cases the dependence is so complicated, or the system so unstable,
>that the series must just be treated as random. However, many chaotic
>systems have representations that are reasonably simple and not sensitive
>to random noise. If you can find one and figure it out, then you can make
>better predictions.
>
>Aaron C. Brown
>New York, NY

Has there been any use of chaos theory in the social (statistical) sciences? I hear people speak who jump on the 'chaos theory is everything' bandwagon, and I wonder if there have been any real statistical applications. Would anyone care to point me towards some references, if they exist? (Note: I'm not a mathematician.)

=======================================
Joel Winter -- s994299@umslvma.umsl.edu
University of Missouri - St. Louis
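The point that linear correlation misses chaotic structure is easy to demonstrate with the standard textbook example, the logistic map x_{k+1} = 4 x_k (1 - x_k): the series is fully deterministic, yet its lag-1 sample autocorrelation is essentially zero.

```python
import numpy as np

# iterate the chaotic logistic map x_{k+1} = 4 x (1 - x)
x = np.empty(10000)
x[0] = 0.1234
for k in range(1, len(x)):
    x[k] = 4.0 * x[k - 1] * (1.0 - x[k - 1])

lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.3f}")  # near zero despite determinism
```

A linear diagnostic would call this series white noise, yet plotting x[k+1] against x[k] puts every point exactly on the parabola 4x(1-x) - a simple representation of the kind the quoted reply describes.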
I'm putting together the web site for the National Center for Environmental Statistics. We are planning to keep a listing of home pages for environmental statisticians. The center's home page URL is:

  http://www.stat.washington.edu/NCES

The listing can be found in the resources link, or you can go there directly:

  http://www.stat.washington.edu/NCES/environ_people.shtml

If anyone is interested in adding a link to their web page, please send me the address of the web page and how you would like your name to appear. The subject line should read: Env stat list

Thanks,
Peter Sutherland
cactus@stat.washington.edu