Christian Campbell wrote:
> I am a buyer of technical books at Brown University. So, I thought I'd go
> to the people who read these books to find out which books are "must-haves"!
> If you have any suggestions, please e-mail me. I am particularly interested
> in recent non-computer titles, but I also stock a number of technical
> classics.
>
> Thank you,

I collect what I call bibles from the various areas I have worked in; it is not a big list:

Handbook of Steel Construction
Fluid-Dynamic Drag (and Lift) by Hoerner
Analysis and Design of Flight Vehicle Structures by Bruhn
Marks' Standard Handbook for Mechanical Engineers
Formulas for Stress and Strain by Roark & Young
Precision Machine Design by Slocum
Low-Speed Aerodynamics by Plotkin and Katz
The Finite Element Method by Zienkiewicz & Taylor
Machine Design by Shigley (Rothbart is pretty good as well)

Bill McEachern
hi, can someone recommend a book or manual for Minitab, Release 11 for Windows 95? In particular I'm looking for info on how to interpret Minitab output of a multinomial logistic regression; while my school has it on its machines, we don't seem to have any up-to-date manual around.

I'd appreciate direct email, or responses at least cc'd to me, as I don't get to this neck of the woods much. TIA!

Tse-Sung (please note crossposting)
_________________________________________________________________
.s.o.l.i.c.i.t.a.t.i.o.n, .j.u.n.k. .m.a.i.l. .u.n.w.e.l.c.o.m.e
Tse-Sung Wu......................................tsesung+@CMU.EDU
Engineering & Public Policy............Carnegie Mellon University
BH-129...................................Pittsburgh, PA 15213 USA
voice: +1 412-268-3005.......................fax: +1 412-268-3757
www.epp.cmu.edu/~tw1u/wu.html.............www.ce.cmu.edu:8000/GDI
In article <32AE9C28.197B@geomar.de>, Hildegard Westphal writes:
|> Hallo everybody!
|>
|> I would like to bring to you a problem concerning clustering of
|> large data sets (1000 elements / 30 attributes). I am looking for a Mac
|> program that not only can handle such large data sets and apply
|> different clustering algorithms (single linkage, average
|> linkage... / Chi-2... / R-mode, Q-mode), but also can plot dendrograms.
|> Does anyone have experience with such programs and can give me a
|> hint?

Hint one: Don't forget about the curse of dimensionality. There is no possible way you can do clustering in anything approaching 30-space without major dimensionality reduction.

Brad
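Brad's hint about reducing dimensionality before clustering can be sketched with a PCA projection via the SVD. This is just one common reduction technique, not something the original posters specified; the 1000 x 30 shape mirrors the data set described above, but the numbers here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))   # stand-in for 1000 samples x 30 attributes

# Center each attribute, then take the SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project onto the first k principal components before clustering.
k = 3
scores = Xc @ Vt[:k].T            # reduced data, shape (1000, k)
print(scores.shape)
```

One would then run single linkage, average linkage, etc. on `scores` rather than on the raw 30-dimensional rows.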
New Zealand Statistical Association 48th Annual Conference
University of Auckland
Wednesday, July 9 - Friday, July 11, 1997

Themes of the Conference are Bayesian Statistics, including Markov Chain Monte Carlo, and Statistical Ecology. It is expected that there will also be sessions on Official Statistics, Biostatistics, Statistical Theory, and Statistical Education. Contributed papers in any area of statistics will, however, be accepted for the conference program.

Keynote speakers who have accepted invitations to speak at the Conference are Peter Hall (ANU), Luke Tierney (Minnesota), Steve Buckland (St Andrews), Keith Worsley (McGill), and Richard Huggins (La Trobe). Peter Hall's talk will be presented jointly with the joint meeting of the Australian Mathematical Society and the New Zealand Mathematics Colloquium, which is being held in Auckland from July 7 to July 11.

Steve Buckland is to present a Workshop on Line Transect and Distance Sampling for Estimation of Wildlife Populations on the morning of July 11. The Workshop and the sessions on Statistical Ecology are intended to be interdisciplinary, bringing together researchers from Biology, Ecology and Statistics.

Accommodation has been reserved for participants in the student residence Grafton Hall, which is close to the University. The deadline for submission of abstracts is May 23, 1997.

For further details concerning the Conference, or to register your interest, there is a link on the home page of the Statistics Department at the University of Auckland (http://www.stat.auckland.ac.nz/). Alternatively, contact:

Associate Professor David J Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019, Auckland, New Zealand
Phone: +64 9 373 7599
Fax: +64 9 373 7177
Email: d.scott@auckland.ac.nz or dscott@scitec.auckland.ac.nz
Can someone post an algorithm for the 4 moments in either BASIC or Pascal, with a brief explanation? tia

art
arte@panix.com
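The four moments the poster asks about are usually taken to be the mean, variance, skewness, and kurtosis. A minimal sketch (in Python rather than the requested BASIC or Pascal, but the formulas translate line for line) using the population (biased) moment definitions about the mean:

```python
import math

def four_moments(xs):
    """Mean, variance, skewness, and excess kurtosis of a sample.

    Population (divide-by-n) definitions are assumed here; adjust the
    denominators if you need the unbiased sample versions.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    skew = m3 / sd ** 3
    kurt = m4 / m2 ** 2 - 3.0   # excess kurtosis: 0 for a normal
    return mean, m2, skew, kurt

m, v, s, k = four_moments([1.0, 2.0, 3.0, 4.0])
print(m, v, s, k)
```

A symmetric sample like the one above has skewness exactly zero, which is a quick sanity check for any translation of the formulas.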
The most direct is this: Let F be the cumulative distribution function of N(0,1), let G be the inverse of F, let U be a uniform[0,1] random variable, and define X = G(U). (Approximations and algorithms for G are in many standard references.) Then pr(X <= x) = pr(U <= F(x)) = F(x), so X is distributed N(0,1).

Babak Fakhamzadeh wrote in article <58mqdb$au9@goliat.eik.bme.hu>...
| Hi,
|
| Can anyone tell me an easy conversion scheme from variables, picked from
| a uniform distribution, to variables picked from a normal one?
|
| Thanx
|
| Babak
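The inverse-CDF method above can be sketched directly with the Python standard library, whose `statistics.NormalDist` provides the quantile function G as `inv_cdf` (the sample-size and tolerance choices below are illustrative):

```python
import random
from statistics import NormalDist

# Inverse-CDF method: if U ~ Uniform(0,1) and G is the N(0,1) quantile
# function, then X = G(U) ~ N(0,1).
G = NormalDist(0, 1).inv_cdf

random.seed(1)
# random.random() lies in [0, 1); inv_cdf requires 0 < p < 1, but a draw
# of exactly 0.0 is vanishingly unlikely.
samples = [G(random.random()) for _ in range(100_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(round(mean, 2), round(var, 2))
```

The sample mean and variance should come out close to 0 and 1, as the argument above predicts.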
jim bouldin (jrbouldin@ucdavis.edu) wrote:
> Thank you Scott, and thanks to the others who replied to my original
> question. The comments have been most helpful and understandable. I DO
> want to sort out the separate effects of the ind variables for a world
> that may be very different from the one the data came from.
> Specifically, it has to do with the fact that even though temperature
> and precipitation may be highly negatively correlated now (as one goes
> up in elevation in the mountains), under various global warming
> scenarios, they may not (i.e. probably will not) have the same
> relationship in the future. The scientific value of my dissertation
> will be increased greatly if I can offer some estimates of the change in
> growth rates under various combinations of changes in the two variables
> in the future, along with estimates based on assumptions of the current
> relationships holding true. (I understand that I still must confine
> reasonable predictions to the range of values for the ind variables in
> the data set).

You seem to be missing the point. Suppose precip and temp are exactly collinear and there are no other explanatory variables. Then the data all lie on some line in (temp, precip) space. Even if there were no regression error at all in your data, you would not be able to separate the effects of precip and temp. To do that you need at least some variation in one of the independent variables that isn't perfectly explained by the other.

Think of it this way. The estimated regression surface can be thought of as a plane in (precip, temp, z) space, where z is the dependent variable. If there weren't any errors, nor much collinearity, and if the world were really linear, then all of your measured data would lie on this plane. You could physically place a sheet of metal on top of the data points, then measure the slope of this sheet in the precip and temp directions in order to get the regression coefficients.
Now suppose that you have perfect collinearity. Then your attempt to place the sheet of metal on top of the data will fail. You may have more than three points of support, but since they are all on a line, there is no unique way to rest the sheet on top of the data. You don't have three _independent_ points of support. The metal sheet can be tilted one way or the other, pivoting around the line that holds all of your data. As it pivots, the measured slopes in the two directions of interest will change.

You can pin down the tilt by adding artificial data points away from the line. This is essentially what ridge regression and related methods do. However, the measured slopes are then determined by the artificial points you choose, and not by the data.

The situation in which there is a lot of collinearity, but not perfect collinearity, is similar. Now all of your data are close to being on a line, but not exactly on a line. Any measurement errors will cause the resting angle of the piece of metal to pivot dramatically, essentially because you don't have any data away from the line. Your estimate will be very sensitive to small changes in the measurement errors of the dependent variable, hence highly variable. You can reduce this error by adding an artificial point away from the data (e.g. ridge regression), but the measured slopes are again not primarily determined by the data in that case.

> Let me summarize my understanding now and pose another question or two.
> Zar's idea (section 19.6 of the '96 edition of Biostatistics) of
> removing the correlation between two ind variables by doing a
> regression of the residuals that result from two linear regressions (in
> my case temp vs precip and temp vs growth rate, to obtain the
> relationship of precip vs growth rate with the effects of temperature
> "removed"), will NOT provide an estimate of the effect of changing
> precipitation on growth rate, independent of temperature changes.

No.
This works, but no better than using the original regression, except possibly through reducing numerical roundoff errors in your computer. If you could do exact calculations, then this would yield exactly the same estimate of the effect of precip on growth as you get when regressing growth on precip and temp at the same time.

Consider that if you have perfect collinearity then there are no residuals when you regress temp vs. precip, or more precisely, the residuals are all zero. A bunch of zeros aren't going to help you predict growth rates.

> The reason for this is that part of precipitation's effect on growth is
> unrecoverably hidden in precip's correlation with temp. The residuals
> explain only that part which is NOT correlated with temperature, not the
> full effect of changes in precip.

The residuals in the regression of precip on temp are necessarily uncorrelated with temp. However, the residuals in the regression of growth on temp are also uncorrelated with temp. So there is less variation in growth that the precip residuals have to explain. The mathematics of linear regression tell us that the slope of this residual/residual regression is exactly the same as the slope on precipitation in the original multivariate regression model.

> So, question one. Would a solution be an analysis of covariance by
> turning one of the two continuous ind variables into a categorical one
> and using it as a covariate? Within each category the correlation
> between the two ind vars should be greatly reduced, right?

No. If all of the data are close to being on a line, then this is true for any subset as well. In fact, because we are now allowed to vary the line across groups, the within-group collinearity will tend to be more severe.

> So I could produce an estimate of the independent effects of each ind
> variable on the dep variable, for each category.

No.
If anything the problem is worse, since you have fewer data points within each group, and at least as much collinearity.

> (However, maybe the correlation with the dependent variable will also
> be reduced, defeating the purpose. Nevertheless, isn't my situation
> exactly what ANCOVA is designed for?)
>
> Question two. I know little about ridge regression but it seems to be
> designed for this kind of problem, sacrificing accuracy for precision
> in the estimates of coefficients. Is it worth trying?

That depends on your loss function in making the estimates. Are you willing to live with numbers that have low variability, not because the data pin them down well, but because you have imposed an artificial and somewhat arbitrary constraint on the model? I'm not being facetious. There is an element of this in every model specification decision. Thinking hard about which assumptions you are willing to live with and which are unacceptable is important.

> Someone also mentioned principal components analysis, but I fail to see
> how different linear combinations of the ind variables will allow the
> types of predictions I'm looking for.

PCA is roughly akin to the residual/residual method that you mentioned before. While it might help you solve numerical accuracy issues, it doesn't solve the fundamental problem. It doesn't help at all if you can do exact arithmetic, since it must then produce the same answers as the original regression.

A final point: You say that you realize that you shouldn't make forecasts that involve varying a regressor outside its observed range. I can't think of any argument supporting this view that doesn't also tell you that you shouldn't make forecasts that vary a pair of regressors outside the observed range for the pair. Extrapolation is extrapolation, whether the individual variables, considered one at a time, have reasonable values or not.

--
T. Scott Thompson          email: thompson@charm.net
Severna Park, Maryland     phone: (410) 431-5027
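The claim that the residual-on-residual slope equals the multiple-regression slope on precip (the Frisch-Waugh result Thompson appeals to) can be checked numerically. The synthetic data below are illustrative, assuming a highly collinear temp/precip pair like the one described in the thread:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
temp = rng.normal(size=n)
precip = -0.9 * temp + 0.1 * rng.normal(size=n)   # strongly collinear with temp
growth = 2.0 * temp + 1.5 * precip + rng.normal(size=n)

# Multiple regression of growth on (1, temp, precip).
X = np.column_stack([np.ones(n), temp, precip])
beta, *_ = np.linalg.lstsq(X, growth, rcond=None)

def resid(y, x):
    """Residuals from regressing y on a constant and x."""
    Z = np.column_stack([np.ones(len(x)), x])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ b

# Residualize both growth and precip on temp, then regress residual on
# residual; the slope matches the precip coefficient above exactly.
slope = np.polyfit(resid(precip, temp), resid(growth, temp), 1)[0]
print(beta[2], slope)
```

So, as the post says, Zar's residual trick reproduces the multiple-regression coefficient; it does not manufacture independent information about precip's effect.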
Daniel Nordlund wrote:
>
> Babak Fakhamzadeh wrote:
> >
> > Hi,
> >
> > Can anyone tell me an easy conversion scheme from variables, picked
> > from a uniform distribution, to variables picked from a normal one?
> >
> > Thanx
> >
> > Babak
>
> Below is sample C code to generate two random normal deviates from two
> uniform random deviates. Uniform random deviates are first generated by
> some function, ranf(), in pairs. Compute x1 and x2.
>
> In the DO-WHILE loop, keep generating uniform deviates until the point
> (x1,x2) falls inside the unit circle, i.e. w = x1*x1 + x2*x2 < 1.
>
> Then, compute y1 and y2, which are a pair of normal random deviates,
> distributed as N(0,1).
>
>     float x1, x2, w, y1, y2;
>
>     do {
>         x1 = 2.0 * ranf() - 1.0;
>         x2 = 2.0 * ranf() - 1.0;
>         w = x1 * x1 + x2 * x2;
>     } while ( w >= 1.0 );
>
>     w = sqrt( (-2.0 * log( w ) ) / w );
>     y1 = x1 * w;
>     y2 = x2 * w;

I should have mentioned that this technique is the polar coordinate form of the Box-Muller transform. There are a number of useful references listed at http://taygeta.com/random/gaussian.html

Dan
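A self-contained Python rendering of the same polar Box-Muller method may be helpful; note that, unlike the C sketch above, it also rejects w == 0, where log(w) would blow up (a rare but possible draw):

```python
import math
import random

def polar_box_muller(rand=random.random):
    """Polar (Marsaglia) form of the Box-Muller transform.

    Draws points uniformly in the square [-1,1)^2, keeps only those
    strictly inside the unit circle (and off the origin), and returns
    a pair of independent N(0,1) deviates.
    """
    while True:
        x1 = 2.0 * rand() - 1.0
        x2 = 2.0 * rand() - 1.0
        w = x1 * x1 + x2 * x2
        if 0.0 < w < 1.0:
            break
    w = math.sqrt(-2.0 * math.log(w) / w)
    return x1 * w, x2 * w

random.seed(0)
flat = [v for _ in range(50_000) for v in polar_box_muller()]
mean = sum(flat) / len(flat)
var = sum((v - mean) ** 2 for v in flat) / len(flat)
print(round(mean, 2), round(var, 2))
```

With 100,000 deviates the sample mean and variance should be close to 0 and 1.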
David SQUIRE (squire@cui.unige.ch) wrote:
> I got very little response to this in sci.math, so I thought I'd try it here:
> In article <588plu$s8b@uni2f.unige.ch>, squire@cui.unige.ch (David SQUIRE) writes:
> > Dear all,
> >
> > I have a problem which I feel sure someone must have addressed before.
> > I have a statistic which I compute on a matrix A of observations. This
> > requires that I compute "a_{ij} choose 2" for each matrix element. For
> > any real observation matrix, all the elements are integers, so this
> > presents no problem.
> >
> > I also want to compute an expected value for this statistic. I know how
> > to compute the expected value of each matrix element, but these are
> > then no longer integers. Is there a sensible way to compute "a_{ij}
> > choose 2" when a_{ij} is not an integer? (I suspect that gamma
> > functions may be the answer.)
> >
> > The expected a_{ij} are guaranteed to be rational, so at the moment I
> > am doing the calculation for the integer values obtained by multiplying
> > by the common denominator, and rescaling the result using the known
> > maximum value for the implied sample size. Does this sound reasonable?

No. It appears that you are attempting to evaluate this statistic at the expected value of the data. That will only give the expected value of the statistic if the statistic is linear in the data. However, your statistic is clearly nonlinear, and not even well defined at the expected value of the data.

Try calculating your statistic for each possible value of the data _before_ you do any averaging. That is a more appropriate method, and it avoids having to define what \pi choose 2 might mean.

--
T. Scott Thompson          email: thompson@charm.net
Severna Park, Maryland     phone: (410) 431-5027
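A tiny numerical illustration of Thompson's point, using a made-up two-point distribution: even if one extends "a choose 2" to non-integers via the polynomial a(a-1)/2, evaluating it at E[X] does not give E[C(X,2)], because the statistic is nonlinear:

```python
# X takes the values 1 and 3 with equal probability; the statistic is C(x, 2).
values = [1, 3]
probs = [0.5, 0.5]

def choose2(a):
    """Polynomial extension of 'a choose 2' to non-integer a."""
    return a * (a - 1) / 2

e_x = sum(p * v for p, v in zip(probs, values))               # E[X] = 2
stat_at_mean = choose2(e_x)                                   # C(2,2) = 1
e_stat = sum(p * choose2(v) for p, v in zip(probs, values))   # 0.5*0 + 0.5*3 = 1.5
print(stat_at_mean, e_stat)
```

Averaging the statistic over the data's distribution (1.5) and evaluating the statistic at the average data (1.0) disagree, which is why the statistic should be computed per realization before any averaging.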
Tonight on the way home from dinner I told some of my programming colleagues a funny story I got in email, but after the laughs it turned into a discussion of probability theory, followed by a bet on whose answer was right... please enjoy the short story below if you haven't already heard it, but if you can help settle our bet based on the resulting problem it brought up, I would appreciate it!

The story as I recall it (if you've already heard it, please skip to 'the bet'):

Two college students, supposedly from UVA, were doing quite well in their chemistry class and ended up having several free days during finals before their last exam, in chemistry. Given that they only needed a D on their chem final to pass the course, they decided to enjoy the free days before the exam by partying at a neighboring college. In their ensuing drunkenness, they ended up oversleeping on the day they were supposed to return to take their 'easy' chem exam. Technically, this meant they got a 0 and were now going to fail the course.

They drove back to campus, and on the way concocted a story about how they had returned on time but were seriously delayed by a flat tire and the ensuing towing and repair time. They told their professor the story and begged for a makeup exam. He consented, provided that they could not leave the classroom during the exam for any reason until they were done, and said they could take the exam the next day. They stayed up all night reviewing all of their chemistry knowledge.

They met the professor the next day. He placed them into separate rooms on each side of the hall, gave them their exams, shut the doors, and sat at a desk in the middle of the hallway. The students then opened the exam book to see:

1) Question - worth 100 points: Which tire?
The bet: Hopefully you enjoyed the story, but now for the statistical bet which evolved: what is the probability of both students guessing the correct tire, given that they hadn't agreed beforehand on which tire had gone flat in their concocted story?

Everyone but me said 25%, because there are four tires and only one correct choice. I disagreed, because my vague recollection of my probability course says that you are really evaluating the probability of *two* people choosing the same tire, each of which is a choice among four, so the odds of them both randomly picking the same tire should be less than 25%... i.e. the potential outcomes are 16 different combinations of tires (Student A chooses Right Front, and Student B could choose RF, LF, RB, or LB; A could also choose Left Front, and B could again choose RF, LF, RB, or LB; and so on).

So is it still a 25% chance they both pick the same tire, or is it a 1/16 probability, or...? Actually, now that I have written this out, I am starting to think I am wrong, because A could pick any given tire, and B then has a 25% probability of matching A's pick. Anyway, if you know the definitive answer and could help us resolve this gentlemen's bet, I would greatly appreciate it.

Thanks in Advance,
Less
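The poster's second thought is the right one: of the 16 equally likely (A, B) pairs, exactly 4 agree, so the match probability is 4/16 = 1/4, i.e. 25%. A quick simulation (trial count and seed are arbitrary choices) makes the point:

```python
import random

random.seed(2024)
tires = ["LF", "RF", "LB", "RB"]

trials = 200_000
# Each trial: both students pick a tire independently and uniformly at random.
matches = sum(random.choice(tires) == random.choice(tires) for _ in range(trials))
p = matches / trials
print(p)   # settles near 0.25
```

The 1/16 figure would be the probability of both students picking one *particular* tire, say both saying RF; summed over the four tires it comes back to 1/4.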
Could anyone tell me the definition of the so-called "exponential order"? How is this different from "rate of convergence"? Thanks in advance.

Tatsuo Ochiai
tochiai@students.wisc.edu