Stan Baldwin (sabald01@pop.uky.edu) wrote:

> I need a little help. I am not a statistician!
> I'm running a 2-way ANOVA and I want to run post hocs. I know that I can
> run post hocs like Bonferroni or Scheffe, but if I understand things
> correctly they assume that I'll be running all possible pairwise
> comparisons. That's not what I want to do. Suppose my design has 100
> possible comparisons but I only want to make 25 of them. Then
> Bonferroni and Scheffe are overkill and I lose power. Is there a post
> hoc that allows me to tell it how many comparisons of means I want to
> make and then have it compute significance?
> Another question: In a 2-way ANOVA, if I get no interaction but do get
> significant F scores for my groups, can I still run a post hoc comparing
> individual means? Does anyone know of a good, clear article on this
> subject?
> Last question: What post hoc can I run if my ANOVA design is a repeated
> measures design, i.e. I have both dependent and independent observations?
> If I want to compare two dependent means for possible differences, what
> post hoc can I use?
> Many thanks for any help!!!!
> Stan
> email sabald01@pop.uky.edu

Since you seem to know which subsets of the levels of the factors and interactions are of interest, have you considered the use of contrasts (orthogonal or otherwise) incorporated in the anova?

Leonard Lefkovitch
Hi everyone, I need some help. One of my friends is working on a statistical problem and has run into the following difficulty. Can anyone give me a hint on how to solve it? The problem is: solve for x in terms of t, p and a in the following expression:

   1 - p = [ 1 - f(x,t) ] / [ 1 - f(0,t) ]

where

   f(x,t) = ( 1 / PI ) * g(x,t)
   g(x,t) = sum_{m=1, inf}{ h(m) * h2(m,a,x,t) }
   h(m) = (-1)**(3m-3) / [ (2m-1)! (2m-1) ]
   h2(m,a,x,t) = GAMMA((2m-1)/a + 1) * (x+t)**(2m-1)

GAMMA is the gamma function, i.e. GAMMA(a) = integral_{0,inf}{ x**(a-1) exp(-x) dx }, and PI = 3.14...

Please send me e-mail at fafst2@pitt.edu
Hein Hundal, in <3266C77F.2EC4@kincyb.com>, writes:

> Often the books use regression on half the data set and
> use the second half of the data set to test the model. . . .
> Is this a standard technique? . . . I don't know what data
> mining is, but it sounds like something I might like to learn about.

Data mining is a disparaging term for running lots of analyses until you get the answer you want. For example, if you have 10 independent variables there are 1,024 regressions you could run by including or excluding variables. Plus, each variable can be transformed in many different ways. If you keep searching you will find models that, just by chance, pass all the normal statistical significance tests. But that does not mean they are reliable.

The validation procedure you describe is a good idea in many cases. Among other things, it guards against data mining. If you have plenty of data, your results will be much more reliable if you fit on one data set and validate on another. However, we often do not have enough data to do this.

Aaron C. Brown
New York, NY
Richard F Ulrich wrote:

> When similar polls have been conducted numerous times, the best
> judgement of "How volatile is the race?" - is probably based on how
> polls have varied in the past. They do vary about as the Standard
> Errors would suggest, I am pretty sure, for one agency, though
> there CAN be bigger differences between agencies.

I ask about how much volatility can be expected when a 4% "margin of error" is advertised because of some results I got from a simulation of the tracking poll done by Reuters/Zogby. It seems to indicate that wide swings would be seen even if the underlying preferences don't change at all.

Simulated Reuters tracking poll:
- 300 people sampled each day
- 3-day running average reported (total sample 900)
- 3.3% margin of error (95%) reported
- assume that support for each candidate is constant: Clinton 50%, Dole 38%, Perot 6%.
P = Perot, D = Dole, C = Clinton, x = other or undecided

   % preference (likely voters)                  Clinton lead (%)
   0    5   10   30   35   40   45   50   55   60    5   10   15   20
   |----|----|   |----|----|----|----|----|----|    |----|----|----|
 3 | Px |        | D C |       | o |
 4 | P x |       | D C |       | o |
 5 | P x |       | D C |       | o |
 6 | Px |        | D C |       | o |
 7 | Px |        | D C |       | o |
 8 | Px |        | D C |       | o |
 9 | P x|        | D C |       | o |
10 | P x|        | D C |       | o |
11 | P x|        | D C |       | o |
12 | P x|        | D C |       | o |
13 | P x |       | D C |       | o |
14 | Px |        | D C |       | o |
15 | Px |        | D C |       | o |
16 | x P |       | D C |       | o |
17 | Px |        | D C |       | o |
18 | Px |        | D C |       | o |
19 | xP |        | D C |       | o |
20 | xP |        | D C |       | o |
21 | x P |       | D C |       | o |
22 | Px |        | D C |       | o |
23 | P x |       | D C |       | o |
24 | P x |       | D C |       | o |
25 | P x |       | D C |       | o|
26 | Px |        | D C |       | o |
27 | Px |        | D C |       | o |
28 | P x |       | D C |       | o |
29 | Px |        | D C |       | o |
   |----|----|   |----|----|----|----|----|----|    |----|----|----|
   0    5   10   30   35   40   45   50   55   60    5   10   15   20

So it can be seen that any amount of bogus "news" can be generated from such a poll, even if nothing is really changing except for inherent statistical gyrations. Every time the poll moves a smidgen, Zogby is out touting the shift as being due to this or that external event.

Joe
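Joe's simulation is easy to reproduce. Here is a minimal sketch (my own code, not Zogby's actual methodology): multinomial draws of 300 respondents per day under fixed preferences, with the 3-day rolling Perot share reported, plus the advertised margin of error for the 900-person base.

```python
import numpy as np

rng = np.random.default_rng(1)

# True, constant preferences: Clinton 50%, Dole 38%, Perot 6%, other 6%.
p = [0.50, 0.38, 0.06, 0.06]
days, per_day = 29, 300

# Daily samples of 300 likely voters each.
daily = rng.multinomial(per_day, p, size=days)      # shape (29, 4)

# 3-day rolling totals -> reported shares on a 900-person base.
rolling = daily[:-2] + daily[1:-1] + daily[2:]      # shape (27, 4)
shares = rolling / (3 * per_day)

perot = shares[:, 2] * 100
print("Perot 3-day average ranges from %.1f%% to %.1f%%"
      % (perot.min(), perot.max()))

# Advertised 95% margin of error for a share near 50% on n = 900:
moe = 1.96 * (0.25 / 900) ** 0.5                    # about 0.033, i.e. 3.3%
```

Even with preferences pinned at exactly 6%, the reported Perot number wanders by a couple of points, which is Joe's point.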
Hein Hundal wrote:

[snip]
> I am also trying to find references for two other subjects: data
> mining, and factor analysis. I don't know what data mining is, but it
> sounds like something I might like to learn about.
[snip]

Data mining, as mentioned by another poster in this thread, is a term used pejoratively by statisticians, but more positively by computer scientists; for them, data mining means finding information in databases too large for human comprehension. I don't belong to the data mining camp, but I know they exist. You might try the home page of the Knowledge Discovery and Data Mining Foundation, http://www.kdd.org/, to see what literature you can find.

Hope this helps,
Robert Dodier
In article <54ahpv$sh5@newsgate.dircon.co.uk>, Danny Alexander wrote:

>Does anybody have or know where I can find some code for the maximum
>likelihood estimation of the parameters of a multivariate t model of
>some data I have. Preferably including the code for MLE of the degrees
>of freedom too.
>If not, any good references for the algorithms for doing this?

I do not know of any published algorithms; there may be some. But I would suggest that the logarithm of the density be written as

   C - .5*ln(det \Sigma) - (ln(1 + .5*h*(x-m)'\Sigma^{-1}(x-m)))/h + \phi(t),

where \phi involves the logarithm of the \Gamma function of .5*|1/t +- p|, p being the dimension of the space, and other expressions, so that it behaves well at t=0. If the Beta distribution is to be excluded, t must be positive, but the maximum may occur at t=0. Not doing it this way can involve numerical instabilities. Newton's method, gradient methods, or other standard numerical procedures will work, starting from a good estimator.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
hrubin@stat.purdue.edu   Phone: (317)494-6054   FAX: (317)494-0558
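For the known-degrees-of-freedom case, the standard EM iteration for the multivariate t is short to code: the E-step downweights each point by (df + p)/(df + Mahalanobis^2). This is a generic textbook sketch, not Rubin's parameterization above, and the data here are synthetic:

```python
import numpy as np

def t_mle_fixed_df(X, df, iters=200):
    """EM estimates of location and scatter for a multivariate t
    with known degrees of freedom df."""
    n, p = X.shape
    mu, S = X.mean(axis=0), np.cov(X.T)
    for _ in range(iters):
        d = X - mu
        m2 = np.einsum('ij,jk,ik->i', d, np.linalg.inv(S), d)  # Mahalanobis^2
        w = (df + p) / (df + m2)          # E-step: outliers get small weights
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        d = X - mu
        S = (w[:, None] * d).T @ d / n    # M-step scatter update
    return mu, S

# Synthetic t_5 data with known location (1, -2) and identity scatter.
rng = np.random.default_rng(0)
n, p, df = 4000, 2, 5.0
g = rng.standard_normal((n, p))
X = np.array([1.0, -2.0]) + g * np.sqrt(df / rng.chisquare(df, n))[:, None]
mu_hat, S_hat = t_mle_fixed_df(X, df)
print("location estimate:", mu_hat.round(3))
```

Estimating the degrees of freedom as well requires an outer one-dimensional maximization of the profile likelihood over df, where the reparameterization Rubin describes helps with stability near df = infinity.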
It has been suggested that I use the Weibull distribution to model the distribution of fibre lengths in a paper sheet.
1) What is the nature of the distribution, i.e. where does it come from?
2) What stat program would be best to fit it?
Thanks in advance for your cooperation
J. Hamel
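On the software question: any package with maximum-likelihood fitting of standard distributions will do. As one illustration, scipy can fit the Weibull shape and scale by ML; the sample below is synthetic, drawn from a known Weibull purely to show the round trip (the shape 1.8 and scale 2.5 are made up, not real fibre data):

```python
import numpy as np
from scipy import stats

# Hypothetical fibre-length sample (mm), drawn from a known Weibull.
rng = np.random.default_rng(0)
lengths = stats.weibull_min.rvs(1.8, scale=2.5, size=5000, random_state=rng)

# Lengths are non-negative, so pin the location at zero and fit
# shape and scale by maximum likelihood.
c, loc, scale = stats.weibull_min.fit(lengths, floc=0)
print(f"shape = {c:.2f}, scale = {scale:.2f}")
```

As for where the distribution comes from: the Weibull arises naturally as an extreme-value (weakest-link) law, which is one reason it is popular for fibre strength and length data.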
Richard Timmer wrote:

> So, I am asking whether someone has developed a solution, knows where I can
> look, or can provide helpful tips for determining the S.D. of "a" and "b" for
> a fit to the following equation:
>
> y = a*(1-(EXP(-1*(b*x)))) (or in general terms, y = f(x) in which
> f(x) contains two parameters a & b)

Here's a reference: "Error analysis for parameters determined in nonlinear least-squares fits" by Keith H. Burrell, American Journal of Physics, v. 58 (2), Feb. 1990, pp. 160-4.

Abstract: "This article includes a calculation of the error in parameters derived from the least-squares method of fitting nonlinear models to experimental data. The formula reduces to the well-known result for the case of a linear least-squares fit. It differs, however, from a method for calculating the error that is often employed for the nonlinear case. The difference between the current result and that method's is illustrated with examples from least-squares fits to spectroscopic data."
--
Terry Anderson   tga@math.appstate.edu
Math Sciences Dept., Appalachian State University
Boone, NC 28608 USA   (704) 262 - 2357
http://www.mathsci.appstate.edu/u/math/tga/
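In practice, the covariance matrix returned by a nonlinear least-squares routine gives exactly the parameter standard deviations being asked about: the square roots of its diagonal. A sketch for the posted model with synthetic data (the "true" values a = 3, b = 1.2 and the noise level are made up for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):
    return a * (1.0 - np.exp(-b * x))

# Synthetic data from known parameters, plus Gaussian noise.
rng = np.random.default_rng(2)
x = np.linspace(0.1, 5, 40)
y = f(x, 3.0, 1.2) + rng.normal(0, 0.05, x.size)

popt, pcov = curve_fit(f, x, y, p0=[1.0, 1.0])
perr = np.sqrt(np.diag(pcov))    # standard deviations of a and b
print("a = %.3f +/- %.3f,  b = %.3f +/- %.3f"
      % (popt[0], perr[0], popt[1], perr[1]))
```

This is the linearized (asymptotic) error estimate; Burrell's article discusses when it differs from other commonly used recipes.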
Hein Hundal wrote:
>
> Often the books use regression on
> half the data set and use the second half of the data set to test the
> model, then they reverse the halves. (I.e. use the second half for the
> regression and testing the result against the first half.) Is this a
> standard technique? I have also seen the data set divided up into
> thirds. Can anyone recommend references for getting the most
> information out of a set of data?
>
> I am also trying to find references for two other subjects: data
> mining,

1. The technique of splitting the data into 2 halves, fitting to one and testing on the other, is called 2-fold cross-validation. Yes, cross-validation is a standard technique. 5- and 10-fold cross-validation (where you fit to 80% and 90% of the data) are also common, as is fitting to all observations but one and then re-doing this as many times as you have observations, leaving out a different observation each time. See 'A Leisurely Look at the Bootstrap, the Jackknife and Cross-Validation' by Bradley Efron and Gail Gong, The American Statistician, Feb 1983, Vol 37, No 1.

2. 'Data mining is the process of automatically extracting non-obvious, hidden knowledge from a database.' A good reference is 'Knowledge Discovery in Databases', edited by Gregory Piatetsky-Shapiro and William J Frawley, 1991, The AAAI Press. ISBN 0-262-66070-9.

Blaise F Egan
Data Mining Group
BT Labs
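The 2-fold procedure in point 1 can be sketched in a few lines. This toy example (my own, with a simple linear model and made-up data) fits on one half, scores on the other, swaps, and averages the two test errors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends linearly on x, plus unit-variance noise.
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, n)

# 2-fold CV: fit on one half, score on the other, then swap.
idx = rng.permutation(n)
halves = [idx[: n // 2], idx[n // 2:]]
mse = []
for fit_ix, test_ix in [(halves[0], halves[1]), (halves[1], halves[0])]:
    b, a = np.polyfit(x[fit_ix], y[fit_ix], 1)    # slope, intercept
    pred = a + b * x[test_ix]
    mse.append(np.mean((y[test_ix] - pred) ** 2))

cv_mse = float(np.mean(mse))
print(f"2-fold CV mean squared error: {cv_mse:.2f}")
```

Because the model is correct here, the CV error should land near the true noise variance of 1.0; a badly overfit model would score much worse on the held-out half than on the fitting half.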
CALL FOR PAPERS

The Second International Symposium on Intelligent Data Analysis (IDA-97)
Birkbeck College, University of London
4th-6th August 1997

In Cooperation with AAAI, ACM SIGART, BCS SGES, IEEE SMC, and SSAISB
[ http://web.dcs.bbk.ac.uk/ida97.html ]

Objective
=========
For many years the intersection of computing and data analysis contained menu-based statistics packages and not much else. Recently, statisticians have embraced computing, computer scientists are using statistical theories and methods, and researchers in all corners are inventing algorithms to find structure in vast online datasets. Data analysts now have access to tools for exploratory data analysis, decision tree induction, causal induction, function finding, constructing customised reference distributions, and visualisation. There are prototype intelligent assistants to advise on matters of design and analysis. There are tools for traditional, relatively small samples and for enormous datasets.

The focus of IDA-97 will be "Reasoning About Data". We are interested in intelligent systems that reason about how to analyze data, perhaps as human analysts do. Analysts often bring exogenous knowledge about data to bear when they decide how to analyze it; they use intermediate results to decide how to proceed; they reason about how much analysis the data will actually support; they consider which methods will be most informative; they decide which aspects of a model are most uncertain and focus attention there; they sometimes have the luxury of collecting more data, and plan to do so efficiently. In short, there is a strategic aspect to data analysis, beyond the tactical choice of this or that test, visualisation or variable.
Topics
======
The following topics are of particular interest to IDA-97:

* APPLICATIONS & TOOLS
  - analysis of different kinds of data (e.g., censored, temporal, etc.)
  - applications (e.g., commerce, engineering, finance, legal, manufacturing, medicine, public policy, science)
  - assistants, intelligent agents for data analysis
  - evaluation of IDA systems
  - human-computer interaction in IDA
  - IDA systems and tools
  - information extraction, information retrieval

* THEORY & GENERAL PRINCIPLES
  - analysis of IDA algorithms
  - bias
  - classification
  - clustering
  - data cleaning
  - data pre-processing
  - experiment design
  - model specification, selection, estimation
  - reasoning under uncertainty
  - search
  - statistical strategy
  - uncertainty and noise in data

* ALGORITHMS & TECHNIQUES
  - Bayesian inference and influence diagrams
  - bootstrap and randomization
  - causal modeling
  - data mining
  - decision analysis
  - exploratory data analysis
  - fuzzy, neural and evolutionary approaches
  - knowledge-based analysis
  - machine learning
  - statistical pattern recognition
  - visualization

Submissions
===========
Participants who wish to present a paper are requested to submit a manuscript not exceeding 10 single-spaced pages. We strongly encourage that the manuscript be formatted following Springer's "Advice to Authors for the Preparation of Contributions to LNCS Proceedings", which can be found on the IDA-97 web page. This submission format is identical to the one for the final camera-ready copy of accepted papers. In addition, we request a separate page detailing the paper title, authors' names, postal and email addresses, and phone and fax numbers. Email submissions in Postscript form are encouraged. Otherwise, five hard copies of the manuscript should be submitted.
Submissions should be sent to the IDA-97 Program Chairs:

Central, North and South America:
Paul Cohen
Department of Computer Science
Lederle Graduate Research Center
University of Massachusetts, Amherst
Amherst, MA 01003-4610, USA
cohen@cs.umass.edu

Elsewhere:
Xiaohui Liu
Department of Computer Science
Birkbeck College, University of London
Malet Street
London WC1E 7HX, UK
hui@dcs.bbk.ac.uk

IMPORTANT DATES
February 1st, 1997   Submission of papers
April 15th, 1997     Notification of acceptance
May 15th, 1997       Final camera-ready paper

Review
======
All submissions will be reviewed on the basis of relevance, originality, significance, soundness and clarity. At least two referees will review each submission independently. Results of the review will be sent to the first author via email, unless requested otherwise.

Publications
============
Papers which are accepted and presented at the conference will appear in the IDA-97 proceedings, to be published by Springer-Verlag in its Lecture Notes in Computer Science series. Authors of the best papers will be invited to extend their papers for further review for a special issue of "Intelligent Data Analysis: An International Journal".

IDA-97 Organisation
===================
General Chair: Xiaohui Liu
Program Chairs: Paul Cohen, Xiaohui Liu
Steering Comm. Chair: Paul Cohen, University of Massachusetts, USA
Exhibition Chair: Richard Weber, MIT GmbH, Aachen, Germany
Finance Chair: Sylvie Jami, Birkbeck College, UK
Local Arrangements Chair: Trevor Fenner, Birkbeck College, UK
Public. and Proc.
Chair: Michael Berthold, University of Karlsruhe, Germany
Sponsorship Chair: Mihaela Ulieru, Simon Fraser University, Canada

Steering Committee
Michael Berthold      University of Karlsruhe, Germany
Fazel Famili          National Research Council, Canada
Doug Fisher           Vanderbilt University, USA
Alex Gammerman        Royal Holloway London, UK
David Hand            Open University, UK
Wenling Hsu           AT&T Consumer Lab, USA
Xiaohui Liu           Birkbeck College, UK
Daryl Pregibon        AT&T Research, USA
Evangelos Simoudis    IBM Almaden Research, USA

Program Committee
Eric Backer           Delft University of Technology, The Netherlands
Riccardo Bellazzi     University of Pavia, Italy
Michael Berthold      University of Karlsruhe, Germany
Carla Brodley         Purdue University, USA
Gongxian Cheng        Birkbeck College, UK
Fazel Famili          National Research Council, Canada
Julian Faraway        University of Michigan, USA
Thomas Feuring        WWU Muenster, Germany
Alex Gammerman        Royal Holloway London, UK
David Hand            The Open University, UK
Rainer Holve          Forwiss Erlangen, Germany
Wenling Hsu           AT&T Research, USA
Larry Hunter          National Library of Medicine, USA
David Jensen          University of Massachusetts, USA
Frank Klawonn         University of Braunschweig, Germany
David Lubinsky        University of Witwatersrand, South Africa
Ramon Lopez de Mantaras  Artificial Intelligence Research Institute, Spain
Sylvia Miksch         Stanford University, USA
Rob Milne             Intelligent Applications Ltd, UK
Gholamreza Nakhaeizadeh  Daimler-Benz Forschung und Technik, Germany
Claire Nedellec       Universite Paris-Sud, France
Erkki Oja             Helsinki University of Technology, Finland
Henri Prade           University Paul Sabatier, France
Daryl Pregibon        AT&T Research, USA
Peter Ross            University of Edinburgh, UK
Steven Roth           Carnegie Mellon University, USA
Lorenza Saitta        University of Torino, Italy
Peter Selfridge       AT&T Research, USA
Rosaria Silipo        University of Florence, Italy
Evangelos Simoudis    IBM Almaden Research, USA
Derek Sleeman         University of Aberdeen, UK
Paul Snow             Delphi, USA
Rob St. Amant         North Carolina State University, USA
Lionel Tarassenko     Oxford University, UK
John Taylor           King's College London, UK
Loren Terveen         AT&T Research, USA
Hans-Juergen Zimmermann  RWTH Aachen, Germany

Enquiries
=========
Detailed information regarding IDA-97 can be found on the World Wide Web server of the Department of Computer Science at Birkbeck College, London: http://web.dcs.bbk.ac.uk/ida97.html

Apart from presentation of research papers, IDA-97 also welcomes demonstrations of software and publications related to intelligent data analysis, and welcomes those organisations who may wish to partly sponsor the conference. Relevant enquiries may be sent to the appropriate chairs, whose details can be found in the above-mentioned IDA-97 web page, or to:

IDA-97 Administrator
Department of Computer Science
Birkbeck College
Malet Street
London WC1E 7HX, UK
E-mail: ida97-enquiry@dcs.bbk.ac.uk
Tel: (+44) 171 631 6722
Fax: (+44) 171 631 6727

There is also a moderated IDA-97 discussion list. To subscribe, send the word "subscribe" in the message body to: ida97-request@dcs.bbk.ac.uk
Math/Probability book for sale:

Daniel W. Stroock, "Probability Theory: An Analytic View," Cambridge University Press, 1993.

(From the preface: This book is intended for graduate students who have a good undergraduate introduction to probability theory, a reasonably sophisticated introduction to modern analysis, and who now want to learn what these two topics have to say about each other...)

Excellent condition, hardcover (with dust jacket): $29 shipped. Has my name written inside but is otherwise in new condition. If you think the price is too high, make an offer.

Bert Hochwald
hochwald@research.bell-labs.com
Does anyone have any code available for maximum likelihood estimation of the parameters of multivariate t distributions from sample data?

Dan
In the 1986 SIGAPL Conf. Proceedings, Groner and Cook published "Arithmetic of statistical distributions". In it they describe a suite of APL programs that uses "Prony's method" to generate a representation of a set of distributions that they can then manipulate arithmetically. I.e., they can say something like Normal(1,3) + Normal(1.5,4) and get back the result Normal(2.5,5): the sum of a normal distribution with mean=1 and stddev=3 with another, independent normal distribution with mean=1.5 and stddev=4 is a normal distribution with mean=2.5 and stddev=5.

I would very much like to do this sort of thing for LOGNORMAL distributions (with a known covariance). The article says that the authors would next work on multivariate distributions and lognormals, but I've been unable to locate any other work by them. They cite a 1981 article by Hellerman as reference #5, but provide only 4 references.

Unfortunately the article is a little vague as to what Prony's method is, and a web search seems to indicate that Prony's method has something to do with signal processing. Might anybody point me in the right direction (considering that I'm no EE signal processing gearhead)?

TIA
Regards
Tony
Hi,

Can anyone give me pointers on how to simulate populations from a bivariate normal distribution with some given correlation? Help with simulation of other bivariate distributions (e.g. exponential) is also appreciated.
--
Thanks
-Dinkar
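For the bivariate normal case, the usual recipe is to multiply independent standard normals by a Cholesky factor of the target correlation matrix. A minimal sketch (rho = 0.7 chosen arbitrarily):

```python
import numpy as np

def bivariate_normal(n, rho, rng):
    """n pairs from a standard bivariate normal with correlation rho."""
    z = rng.standard_normal((n, 2))                        # independent N(0,1)
    L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    return z @ L.T                                         # correlated pairs

rng = np.random.default_rng(0)
xy = bivariate_normal(100_000, 0.7, rng)
r = np.corrcoef(xy.T)[0, 1]
print(f"sample correlation: {r:.3f}")
```

For other marginals, such as correlated exponentials, one common approach is to push correlated normals through the normal CDF and then through the target inverse CDF; the induced correlation is then close to, but not exactly, the one given to the normals.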
I need help in generating correlated random variables for a simulation program that I am preparing. All I've been able to find deals with normal variates, and what I need is a general routine for variables that may not be normal. Can someone help me with this? I'm not a mathematician and some of the stuff I've come across is hard to digest.

Thank you for your help

Adelino
Tony Corso wrote:

> Unfortunately the article is a little vague as to what Prony's method is, and
> a web search seems to indicate that Prony's method has something to do with
> signal processing.

Prony's method is a way of modelling sampled data as a linear combination of exponentials. Original reference:

de Prony, Baron (Gaspard Riche), [a very, very long title in French], J. Ec. Polytech., vol. 1, cahier 22, pp. 24-76, 1795.

If your library doesn't have that, try Digital Spectral Analysis with Applications, S. Lawrence Marple, Prentice-Hall, 1987, or just about any other text with "Spectrum Analysis" or "Spectrum Estimation" in the title.

> might anybody point me in the right direction, (considering that I'm no EE
> signal processing gearhead)?

I should say not. Gearheads are M.E.s
--
Jim
CWBern wrote:

> I know there are many methods for dealing with outliers. My question is
> for the pharmaceutical or med device people. What is the method (is there
> a standard?) that is acceptable to the FDA when an outlier is throwing off
> the calculations? Specifically, two extreme outliers are causing the
> normality assumption to be violated. This is in regards to a validation
> study. If the outliers are not included, the data becomes normal, and
> everything works out nicely.
> How can I justify not including these outliers in the calculations? (My
> customer does not want to change his analysis test, so please don't tell
> me of some fantastic non-parametric test.) Please refer to any written
> standards or industry-wide accepted techniques.
>
> thanks.

Well, it depends which part of the FDA is interested and may, in fact, depend on whom you are working with. However, there is a guideline to use Grubbs' method for identifying outliers. From my experience, it is highly unlikely that any well-behaved physical process will generate an outlier extreme enough to fail Grubbs' test. We have usually found that these "outliers" have either been misrecorded or misread from the physical instruments themselves. In that case, Grubbs' method is very helpful in identifying these values.

Frank Grubbs, Procedures for Detecting Outlying Observations in Samples, Technometrics, Vol. 11, No. 1, February, 1969, pp. 1-21

Rodney
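For reference, Grubbs' statistic and its usual two-sided critical value are short to compute. A sketch with made-up data containing one suspect point, assuming the single-outlier, two-sided form of the test at alpha = 0.05:

```python
import numpy as np
from scipy import stats

def grubbs_statistic(x):
    x = np.asarray(x, float)
    return np.max(np.abs(x - x.mean())) / x.std(ddof=1)

def grubbs_critical(n, alpha=0.05):
    """Two-sided critical value for Grubbs' test (single outlier)."""
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    return (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))

# Hypothetical measurements; the last value looks misrecorded.
x = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.1, 10.0, 14.0]
G, Gcrit = grubbs_statistic(x), grubbs_critical(len(x))
print(f"G = {G:.2f}, critical value = {Gcrit:.2f}, outlier flagged: {G > Gcrit}")
```

Note the test assumes the bulk of the data is normal, so it fits Rodney's use case (flagging transcription errors) rather than justifying routine deletion of awkward points.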
How can I recover the random sequence y from the following?

   z_t = y_t + x_t

where z is known, and all we know about y and x is:
1) their respective variances,
2) y is serially uncorrelated,
3) x is serially correlated,
4) x is independent of y.

Might as well assume everything is normally distributed.

***************
Ted Sternberg
San Jose, California, USA
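Under normality, the best estimate is the linear (Wiener/signal-extraction) one: y_hat = Cov(y) [Cov(y) + Cov(x)]^{-1} z. That requires knowing x's autocovariance, not just its variance; the sketch below assumes, for illustration only, that x is an AR(1) process, with all parameter values made up:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate: y white noise, x an AR(1) process (serially correlated).
n, sy, sx, phi = 400, 1.0, 1.0, 0.9
y = rng.normal(0, sy, n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal(0, sx * np.sqrt(1 - phi**2))
z = y + x

# Signal extraction: y_hat = C_y (C_y + C_x)^{-1} z, with
# C_y = sy^2 * I and C_x[i,j] = sx^2 * phi^{|i-j|}.
i = np.arange(n)
Cx = sx**2 * phi ** np.abs(i[:, None] - i[None, :])
Cy = sy**2 * np.eye(n)
y_hat = Cy @ np.linalg.solve(Cy + Cx, z)

mse_raw = np.mean((z - y) ** 2)      # error from using z itself as the estimate
mse_hat = np.mean((y_hat - y) ** 2)
print(f"MSE using z: {mse_raw:.2f}, signal-extraction estimate: {mse_hat:.2f}")
```

The serial correlation in x is what makes y partly recoverable: smooth stretches of z are mostly x, and the estimator exploits that.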
I work with date-related data which is *very* noisy and may have no observations on one date and multiple observations on another date. I have found that an effective strategy for extracting trend information is to group the data by uniform periods (usually weeks), and then use a binomial smoothing function covering an odd number of time periods to compute a running average. This is, I believe, the discrete equivalent of using a Gaussian smoothing function.

Each point in the smoothed data is effectively a weighted arithmetic mean of single observations. By making sure that the smoothing function is normalized to a sum of 1.00, one can also derive an effective number of observations contributing to each point. The total number of observations contributing to the smoothed results comes to slightly less than the actual number input, because the smoothing process effectively smears observations near the ends beyond the actual date range observed.

The problem is that I am not certain how to calculate the standard error and (say) 95% C.I. for the smoothed (weighted mean) observations. I have worked through the calculations, arguing by analogy with the ordinary calculations, and one ends up having non-integral degrees of freedom for each point. The results of this calculation look moderately convincing, but I would be *much* happier if I had an appropriate reference, or advice from someone knowledgeable about this. None of the (relatively elementary) references I have at my disposal touch on calculating the statistics of smoothed data nor of weighted means.

Replies by email welcome.
----
Rodger Whitlock
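For what it's worth, the non-integral "effective n" described above matches the usual effective-sample-size formula for a weighted mean of equal-variance observations: n_eff = 1 / sum(w_i^2) when the weights sum to one, with the standard error of the weighted mean scaling as sqrt(sum(w_i^2)). A sketch with binomial weights (the series values are invented):

```python
import numpy as np

def binomial_weights(k):
    """Row k of Pascal's triangle, normalized to sum to 1 --
    the discrete analogue of a Gaussian smoothing kernel."""
    w = np.array([1.0])
    for _ in range(k):
        w = np.convolve(w, [1.0, 1.0])
    return w / w.sum()

w = binomial_weights(4)               # 5-point kernel: 1 4 6 4 1, over 16
series = np.array([3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0])
smoothed = np.convolve(series, w, mode='valid')

# Effective number of observations behind each smoothed point:
n_eff = 1.0 / np.sum(w**2)            # = 256/70, about 3.66 for this kernel
print("weights:", w, " effective n per point:", round(n_eff, 2))
```

The non-integral degrees of freedom the poster arrives at are in the same spirit as Satterthwaite-type approximations for weighted combinations of variances.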
Consider an X-Y graph (#1) in which both X and Y are REAL variables that can vary from -infinity to +infinity. The area of this graph is therefore infinity*infinity (...?...).

Now, restrict Y to be greater than, or equal to, zero. Is the area of this new graph (#2) (1/2)*infinity*infinity?

What if Y is restricted to the range (1,10)? What is the area of this third graph? (10-1)*infinity? What is the ratio of the area of this plane divided by the area of the first plane? Is it: 9*infinity/(infinity*infinity) = 9/infinity?

What is the area of graph #3 divided by graph #2? Is it: 9*infinity/(0.5*infinity*infinity) = 18/infinity?

If all of this is true, then does it follow that (18/infinity) is twice as large as (9/infinity)?

A reference would be appreciated.
Have you ever seen the following theorem before:

"A gambler takes repeated binomial gambles. At each step he/she invests a fraction alpha of their current wealth. Researchers at Bell Labs have proved that the optimal strategy in this situation is to set alpha equal to p - q."

Is this correct? Who proved this and where is it published? Thanks for any info.
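This is the Kelly criterion (J. L. Kelly, Jr., "A New Interpretation of Information Rate", Bell System Technical Journal, 1956). For even-money binomial gambles with win probability p > 1/2 and q = 1 - p, maximizing the expected log-growth g(alpha) = p*ln(1+alpha) + q*ln(1-alpha) gives alpha* = p - q (set the derivative p/(1+alpha) - q/(1-alpha) to zero). A quick numerical check, with p = 0.6 chosen arbitrarily:

```python
import numpy as np

# Expected log-growth per even-money bet when staking fraction alpha:
#   g(alpha) = p*log(1 + alpha) + q*log(1 - alpha),  q = 1 - p.
p, q = 0.6, 0.4
alphas = np.linspace(0.0, 0.99, 10_000)
g = p * np.log(1 + alphas) + q * np.log(1 - alphas)
best = alphas[np.argmax(g)]
print(f"optimal fraction ~ {best:.3f},  p - q = {p - q:.3f}")
```

Note the optimality is for long-run growth rate of wealth (expected log-wealth), not expected wealth after one bet.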
Hi,

Could someone please suggest how to tackle the following problem?

Q: The receiver of a digital communication system is designed to operate with a BER (bit error rate) of 10^(-10). The receiver must decide whether the observed signal is greater than the decision level, and hence a digital one (or mark) is present, or whether the observed signal is less than the decision level, and hence a digital zero (or space) is present. To operate at a BER of x, the receiver can only make a mistake a fraction x of the time. If the receiver has an rms noise level of 12 mV, and the noise is caused by a multitude of factors, find the decision level that just permits operation at the specified BER.

Thanks in advance
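A sketch of the usual textbook approach: "caused by a multitude of factors" is the cue to take the noise as Gaussian (central limit theorem). Then the probability that noise alone exceeds the decision level D is the Gaussian tail Q(D/sigma), so D = sigma * Q^{-1}(BER), measured from the mean level of a zero. (A full receiver design would place the threshold relative to both signal levels; this computes only the noise margin asked for.)

```python
from scipy.stats import norm

ber = 1e-10          # target bit error rate
sigma = 12e-3        # rms noise, volts

# Error probability is Q(D/sigma); invert the Gaussian tail function.
q_inv = norm.isf(ber)            # Q^{-1}(1e-10), roughly 6.36
D = sigma * q_inv
print(f"Q^-1(1e-10) = {q_inv:.2f}, decision level = {D*1000:.1f} mV")
```

So the level works out to roughly 76 mV above the zero level, i.e. about 6.4 noise standard deviations.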
I was wondering whether the solution to the following problem is known.

If f is uniformly distributed between fmin and fmax, and phi is uniformly distributed between -pi and +pi, is the distribution of

   cos(2*pi*f*t + phi)

known? The random variables f and phi are assumed independent. The variable t is deterministic (e.g. time).

Any info gratefully received.

Ben Rickman
- Is the Logistic distribution known to be Polya of some order r (2 <= r <= infinity)?
- Related question: does anyone know a "reasonable" expression for the characteristic function of the Logistic?
- Thanks

Ray
------------------------------------------------------------------------------
R.G. Vickson
Department of Management Sciences
University of Waterloo
(519) 888-4729
Hi! I'm taking a first-year statistics course in which we're doing a lot of probability, and I'm having a hard time learning this material. So I was wondering if anyone can refer me to some really good books, or any computer software, that would help me with learning probability. Thank you so very much.

G. Chan
In article <845940391.28083.0@ciscr40.demon.co.uk>, cr40@cityscape.co.uk (Ben Rickman) writes:

>I was wondering whether the solution to the following problem is known
>
>If f is uniformly distributed between fmin and fmax and phi is
>uniformly distributed between -pi and +pi is the distribution of
>
> cos(2*pi*f*t+phi)
>
>known?

Correction. I made an error in my previous post. Don't flame too hard! The error was:

   u(-pi,pi] + 2*fmid*t*u(-pi,pi] = (1 + 2*fmid*t)*u(-pi,pi]

It looked good when working it out, but you can't do that! You need to do a convolution between u(-pi,pi] and 2*fmid*t*u(-pi,pi] to get fx(x), then substitute ...

Barry
In article <845940391.28083.0@ciscr40.demon.co.uk>, cr40@cityscape.co.uk (Ben Rickman) writes:

>I was wondering whether the solution to the following problem is known
>
>If f is uniformly distributed between fmin and fmax and phi is
>uniformly distributed between -pi and +pi is the distribution of
>
> cos(2*pi*f*t+phi)
>
>known?

Yes. For the general case of y = cos(x), the solution is in most textbooks as a function of the distribution of x, i.e. fx(x). The problem is to determine the distribution of x.

   u(-pi,pi] + 2*fmid*t*u(-pi,pi] = (1 + 2*fmid*t)*u(-pi,pi]

1) find fmid
2) the problem reduces to finding the distribution of g(t)*u(-pi,pi], and inserting it into the solution to y = cos(x).
3) the solution will be a sum over all values of x which produce a given solution y. I.e., for t=0, there are two values of x that will give the same solution y (plot cos(x) and you'll see why).

Hope this helps.

Barry Vanhoff
bvanhof@eecs.wsu.edu
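A shortcut worth noting: since phi is uniform over a full period and independent of f, the total phase 2*pi*f*t + phi, taken mod 2*pi, is itself uniform for any fixed t. The cosine of a uniform angle follows the arcsine law on [-1,1], with CDF F(y) = 1/2 + arcsin(y)/pi, regardless of fmin and fmax. A Monte Carlo check (my own sketch, with arbitrary fmin, fmax, t):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
f = rng.uniform(2.0, 5.0, n)             # any fmin, fmax will do
phi = rng.uniform(-np.pi, np.pi, n)
t = 0.37                                 # arbitrary fixed time

y = np.cos(2 * np.pi * f * t + phi)

# Compare the empirical CDF with the arcsine law F(y) = 1/2 + arcsin(y)/pi.
ys = np.linspace(-0.95, 0.95, 20)
emp = np.array([(y <= v).mean() for v in ys])
theo = 0.5 + np.arcsin(ys) / np.pi
print("max CDF discrepancy:", np.abs(emp - theo).max())
```

The uniform-phase argument is why the answer does not depend on the distribution of f at all.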
Special Issue on RANDOM SETS in the Journal of PATTERN RECOGNITION

edited by
Ilya Molchanov (University of Glasgow)
Edward Dougherty (Texas A&M University)

Random sets have played a role in image processing and pattern recognition since the seminal text, RANDOM SETS AND INTEGRAL GEOMETRY, by George Matheron (1975). At first there were only a few researchers using random set theory, but recently the number has begun to grow. Two conferences devoted to random sets occurred in 1996, one at the University of Minnesota and another at the Ecole des Mines in Fontainebleau.

The purpose of the present special issue is to provide a forum for current research in both the theory and application of random sets, and to give a sampling of current trends to a wide audience. Potential topical areas include random-set theory, spatial statistics, image analysis, random geometry, texture analysis, random-set modeling for pattern recognition, filtering in the context of random sets, stochastic mathematical morphology, coverage processes, point processes, random measures, set statistics, and applications in all of the aforementioned areas.

The final date for manuscript submission is November 1, 1997. All manuscripts will be peer reviewed for acceptance. For consideration, please submit four (4) copies of the complete manuscript to:

Dr. Ilya Molchanov
University of Glasgow
Department of Statistics
Glasgow G12 8QW
Scotland, U.K.

------------------------------------------------------------------------
I. Molchanov, Department of Statistics, University of Glasgow
Glasgow G12 8QW, Scotland, U.K.
e-mail: ilya@stats.gla.ac.uk
Ph.: +44 141 339 8855 ext 2116
Fax: +44 141 330 4814
http://www.stats.gla.ac.uk/~ilya/
------------------------------------------------------------------------
I'm working on a random number generator that is to be used in a simulation (it is to be coupled with a structural analysis program) and have run into a problem that I haven't been able to solve or find a reference for: the generation of correlated random variables. So far I've come across an algorithm to generate correlated normal variates, but what I need is a general procedure applicable regardless of the variates' particular distributions... does such a beast exist? Where can I find it?

Your help is most appreciated.

Adelino
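One general-purpose recipe (often called NORTA, "NORmal To Anything", or a Gaussian copula construction) builds directly on the correlated-normal algorithm already in hand: generate correlated normals, convert them to correlated uniforms with the normal CDF, then push each column through the desired inverse CDF. A sketch, with the caveat that the specified correlation applies to the underlying normals and is only approximately inherited by the transformed outputs (the exponential/gamma marginals and the 0.8 value are arbitrary examples):

```python
import numpy as np
from scipy import stats

def norta(n, corr_z, marginals, rng):
    """Correlated non-normal variates via the NORTA construction.
    corr_z is the correlation matrix of the underlying normals;
    marginals is a list of frozen scipy distributions, one per column."""
    L = np.linalg.cholesky(corr_z)
    z = rng.standard_normal((n, len(marginals))) @ L.T   # correlated normals
    u = stats.norm.cdf(z)                                # correlated uniforms
    return np.column_stack([m.ppf(u[:, j]) for j, m in enumerate(marginals)])

rng = np.random.default_rng(0)
corr_z = np.array([[1.0, 0.8], [0.8, 1.0]])
x = norta(50_000, corr_z, [stats.expon(), stats.gamma(2.0)], rng)
r = np.corrcoef(x.T)[0, 1]
print(f"achieved correlation: {r:.3f}")
```

If an exact target correlation for the outputs is needed, the usual trick is to search numerically for the normal correlation that induces it.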
******************************************************************
**********              Questionnaire                  ***********
* Your experience with interactive graphical statistics software *
******************************************************************

The purpose of this questionnaire is to gather information about the status of interactive statistical graphics in contemporary practice. These are some of the questions that we would like to address using your responses: How are statisticians and other data analysts using interactive, dynamic statistical graphics? Which methods, if any, have come into wide use? Which methods have yet to demonstrate their usefulness? What new methods or extensions to existing methods are frustrating analysts by their absence?

The results of this survey are to appear in a special issue of Computational Statistics. We also invite the submission of papers about applications of interactive statistical graphics.

Please send your responses, questions and other comments to statvis@bellcore.com. Paper mail can be sent to:

Deborah Swayne
Bellcore
445 South Street MCC-1A316B
Morristown, NJ 07960-6438 USA

1: Describe your data. (If you would like to comment on more than one data problem, describe as many sets of data as you like. Label your examples however you like, and then you can refer back to them when you comment on the methods.) If your data are proprietary, please say as much about them as you find appropriate.
   What is the subject of the data?
   What is the structure of the data? How many cases, variables?
   What kind of variables -- nominal, ordinal, metric-continuous, metric-discrete?
   Are there missing values?

2: What interactive graphics software did you use, or try to use? On what computing platform(s) -- a UNIX workstation, a PC running MS Windows, PC running OS/2, PC running LINUX, other?
   DataDesk
   Lisp Stat interactive graphics
   S Plus interactive graphics
   SAS Insight or JMP
   XGobi
   XploRe
   Voyager
   other ...

3: For the following methods: Which did you try? Did you find it reasonably easy to use? Was it useful? If not, why not? If so, what did it tell you about your data? (Be as terse or as verbose as you like.)
   Interactive methods
      Brushing, linked brushing
         Between like plots: boxplot, scatterplot, barchart ...
         Between unlike plots: boxplot to scatterplot, scatterplot to barchart ...
      Scaling
      Identification, linked identification
      Subsetting of the data set
      Viewing multiple plots in rapid succession
   One-variable plots
   Two-variable plots
   Higher-d plots
      Parallel coordinate plots
      Rotation
      Grand tour
         Manipulation of the grand tour direction
      Interactive projection pursuit
         Which indices?
   Did you print out many plots?

4: What methods or software do you expect to use in the future? Are there any that you plan never to use again? (This is a good place for appreciative testimonials or expressions of frustration or disgust.)

5: What would you have liked to do that you couldn't find a way to do?

6: If you could publish an account of your experiences with interactive graphical software for data analysis, would you like to do so?

7: About you:
   Where do you work? (check all that apply)
      in industry
      for a university
      for yourself
      other
   What is your field of work or study?
      Agriculture, Biology, Chemistry, Economics, Engineering, Statistics, other
   Did you receive this questionnaire on the s-news mailing list? on usenet (which group(s))? by direct email? other?

8: Any other comments?

Thank you very much for your participation. We hope that gaining some answers to these questions will help guide future research and development of graphics for data analysis. Again, send email to statvis@bellcore.com, and paper mail to Deborah Swayne at the address given above.

Deborah Swayne                    Sigbert Klinke
Bellcore                          Humboldt-University of Berlin
Ben Rickman writes:

>I was wondering whether the solution to the following problem is known.
>
>If f is uniformly distributed between fmin and fmax and phi is
>uniformly distributed between -pi and +pi, is the distribution of
>
>   cos(2*pi*f*t + phi)
>
>known?
>
>The random variables f and phi are assumed independent. The variable t
>is deterministic (e.g. time).

Assuming that you are picking one time and measuring this function, the arbitrary frequency does not matter: you are equally likely to pick any given phase of the cosine. Given y = cos(x) with x uniformly distributed over a full period, the density of y is obtained from the derivative of the inverse function:

   p(y) = 1 / (pi * sqrt(1 - y^2))  for -1 < y < 1,  0 otherwise

(the arcsine distribution). If you are picking a bunch of times, each measurement will be distributed this way, but the measurements will not be independent (not iid).

-Bill
billt@leland.stanford.edu
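The arcsine form of this distribution is easy to check by simulation; a minimal sketch (the sample size and the choice of checkpoints are mine, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# One sample of cos(2*pi*f*t + phi): for a single fixed t, adding the
# uniform(-pi, pi) phase phi makes the whole argument uniform modulo 2*pi,
# whatever f is, so we can sample the phase x directly.
x = rng.uniform(-np.pi, np.pi, n)
y = np.cos(x)

# Compare the empirical CDF with the arcsine CDF
# F(y) = 1 - arccos(y)/pi, the integral of 1/(pi*sqrt(1-y^2)).
for q in (-0.5, 0.0, 0.5):
    print(q, np.mean(y <= q), 1 - np.arccos(q) / np.pi)
```

The two columns agree to within simulation noise, confirming that the normalizing constant 1/pi belongs in the density.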
In article <326D24F5.41C67EA6@bechtel.Colordao.edu>, DE ALMEIDA ADELINO wrote:

>[I] have run into a problem that I haven't been able to solve or to find a
>reference that might help me solve it: the generation of correlated
>random variables.
>
>So far I've come across an algorithm to generate correlated normal
>variates, but what I need is a general procedure that is applicable
>regardless of the variates' particular distribution... does such a
>beast exist? Where can I find it?

No such beast exists, because the problem is not well defined. In general, two random variables with given distributions can have a given degree of correlation in many ways. The normal distribution is an exceptional case, in which specifying the marginal normal distributions and the degree of correlation suffices to determine the joint distribution.

To see this, consider the simplest case: generating pairs of values for X and Y in which both X and Y have the uniform(-1,1) distribution, with zero correlation between X and Y. One way of achieving this is to make X and Y independent, generating uniformly from the square in the (x,y) plane with corners at (-1,-1) and (1,1). Another, very different, way is to generate uniformly from the union of two line segments, one going from (-1,-1) to (1,1) and the other from (-1,1) to (1,-1).

   Radford Neal

----------------------------------------------------------------------------
Radford M. Neal                                       radford@cs.utoronto.ca
Dept. of Statistics and Dept. of Computer Science radford@utstat.utoronto.ca
University of Toronto                     http://www.cs.utoronto.ca/~radford
----------------------------------------------------------------------------
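The two constructions described above can be written out directly; a sketch in Python (sample size and variable names are my own choices), showing that both give uniform(-1,1) marginals and near-zero correlation while having completely different joint behaviour:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Construction 1: X, Y independent, uniform on the square [-1,1] x [-1,1].
x1 = rng.uniform(-1, 1, n)
y1 = rng.uniform(-1, 1, n)

# Construction 2: uniform on the union of the two diagonals of that square.
# Take a point on the diagonal y = x, then flip the sign of y half the time;
# by symmetry the marginals are still uniform(-1,1).
x2 = rng.uniform(-1, 1, n)
y2 = x2 * rng.choice([-1.0, 1.0], n)

# Both pairs have sample correlation close to zero ...
print(np.corrcoef(x1, y1)[0, 1])
print(np.corrcoef(x2, y2)[0, 1])

# ... but the joint distributions differ completely:
# under construction 2, |Y| always equals |X| exactly.
print(np.max(np.abs(np.abs(y2) - np.abs(x2))))  # 0.0
```

So matching marginals plus a correlation value does not pin down a joint distribution, which is exactly why no fully general generator can exist.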
Hi,

I have a problem: I need to linearize a nonlinear stochastic ODE of Ito type. Does anyone have any ideas? It's actually a system of coupled ODEs with state-dependent diffusion. Can I just Taylor expand it?

Rodney Beard
University of Queensland
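One common quick answer is a local (pathwise) Taylor linearization of the drift and diffusion functions about a reference point; note this is only a first-order approximation and ignores the distinctions that a more careful "statistical linearization" respects for Ito equations. A sketch with made-up example functions (the drift and diffusion below are hypothetical, not from the post):

```python
import numpy as np

def linearize(f, x0, h=1e-6):
    """First-order Taylor approximation of f about x0 (central-difference slope)."""
    slope = (f(x0 + h) - f(x0 - h)) / (2.0 * h)
    return lambda x: f(x0) + slope * (x - x0)

# Hypothetical scalar Ito SDE  dX = a(X) dt + b(X) dW  (example functions only).
a = lambda x: -x ** 3                      # nonlinear drift
b = lambda x: 0.5 * np.sqrt(1.0 + x ** 2)  # state-dependent diffusion

x0 = 1.0                  # reference point for the expansion
a_lin = linearize(a, x0)
b_lin = linearize(b, x0)

# dX = a_lin(X) dt + b_lin(X) dW is now a linear SDE; it agrees with the
# original at x0 and is a first-order approximation nearby.
print(a_lin(1.0))   # equals a(1.0) = -1.0
print(a_lin(1.1))   # about -1 + (-3)(0.1) = -1.3
```

For a coupled system the same idea uses the Jacobian of the drift and diffusion vectors instead of scalar slopes.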
I am looking for a treatment of the problem of curve fitting in the case where there is NO INDEPENDENT variable. For example, you have a sensor system that measures space and time (all subject to errors) and you need to fit a curve to the data.

I initially dealt with this problem for the case of fitting lines. As usual, the solution is derived by minimizing the sum of the squared distances of the data from a putative line. Points on the line are indexed by a free parameter. In linear regression this "parameter" is the independent variable and is KNOWN. Here there is no independent variable, so the solution has an additional initial step in which the free parameter must, in effect, be determined by finding the line through each data point that is perpendicular to the putatively fitted line. So the error components are much messier than for regression calculations. The 2D case is simple and the result is the so-called "standard deviation line" (in between the x-on-y and y-on-x regression lines). The 3D case yields horrendous equations, and it is not clear to me that they boil down to the s.d. line.

But now I want to fit cubic splines - just calculating the components of the error function (squared distance from curve to data point) would require finding the appropriate root of a sixth-degree polynomial. That method is going nowhere! There is a simpler way, right?

Cheers,
Mark

p.s. Please communicate by e-mail. Thanks in advance for any help.
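For the straight-line case, minimizing the sum of squared perpendicular distances is the classic total-least-squares / principal-axis problem, and it has a closed form via the eigenvectors of the data covariance matrix, with no per-point root finding. A minimal 2-D sketch (function and variable names are mine):

```python
import numpy as np

def orthogonal_line_fit(pts):
    """Fit a line minimizing squared perpendicular distances to the points.
    Returns (centroid, unit direction vector): the first principal axis."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    # Direction = eigenvector of the covariance matrix with largest eigenvalue.
    cov = np.cov((pts - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    direction = eigvecs[:, np.argmax(eigvals)]
    return centroid, direction

# Points lying exactly on y = 2x + 1 are recovered exactly.
pts = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
c, d = orthogonal_line_fit(pts)
print(c)            # the centroid, (0.5, 2.0)
print(d[1] / d[0])  # the slope, 2.0 up to floating-point rounding
```

For splines the same criterion has to be handled iteratively, alternating foot-point projection onto the current curve with refitting; this is the approach taken by orthogonal-distance-regression codes such as ODRPACK.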
Bill Shipley and Lyne Labrecque wrote:
>
> Consider an X-Y graph (#1) in which both X and Y are REAL variables that can vary from
> -infinity to +infinity. The area of this graph is therefore infinity*infinity
> (...?...). Now, restrict Y to be greater than, or equal to, zero. Is the area of this
> new graph (#2) 0.5*infinity*infinity? What if Y is restricted to the range (1,10)? What
> is the area of this third graph? (10-1)*infinity? What is the ratio of the area of this
> plane divided by the area of the first plane? Is it:
> 9*infinity/(infinity*infinity) = 9/infinity?
> What is the area of graph #3 / graph #2? Is it:
> 9*infinity/(0.5*infinity*infinity) = 18/infinity?
> If all of this is true, then does it follow that (18/infinity) is twice as large as
> (9/infinity)?
> A reference would be appreciated.

Any finite multiple of infinity is still infinity, so expressions like 9/infinity and 18/infinity are not ordinary numbers: taken at face value they are indeterminate forms, and comparing them gives no defined result. To get a meaningful answer you must specify a limiting process, and different processes can give different answers.

Michael Woodall
Mathematics Teacher, Montreal
--
Woody!
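One way to make these comparisons precise - my own framing, not from the thread - is to replace infinity by a finite cutoff n and take limits:

```latex
\frac{9n}{n\cdot n}=\frac{9}{n}\to 0,\qquad
\frac{9n}{\tfrac12\,n\cdot n}=\frac{18}{n}\to 0,\qquad
\text{yet}\quad \frac{18/n}{9/n}=2\ \text{for every finite }n.
```

Both ratios vanish in the limit, while their quotient is 2 at every finite stage; which of these facts counts as "the" answer depends on the limiting process chosen, which is why the bare symbols 9/infinity and 18/infinity are left undefined.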
A student from a country in the Caribbean has applied for admission to the undergraduate program here at the U. of Windsor. He has submitted a document indicating that he has received a Royal Statistical Society (London) Higher Certificate, which apparently can be obtained by passing a set of exams. The topics are fairly standard - statistical analysis, inference, nonparametrics, etc. Presumably these are at a lower undergraduate level.

Can anyone give a more precise indication of the level of this certificate? Is it considered equivalent to a course or courses at any university? Or, can anyone give me a contact so I can find out (preferably an e-mail contact)?

Thanks,
--
Myron Hlynka
Dept. of Math. & Stat.
University of Windsor
Windsor, Ontario, Canada
Subject: Faculty Positions Announcement

The Department of Forestry, National Taiwan University, is seeking applicants with solid academic training for two positions at the lecturer (assistant professor) level in the following two fields:
(1) Environmental Monitoring and Planning
(2) Resources Inventory

MINIMUM QUALIFICATIONS:
1. Ph.D., and at least one degree (BS, MS, Ph.D.) obtained from a forest-resources-related program.
2. Fluency in spoken and written Chinese.

APPLICATIONS:
Applications should include: (1) a curriculum vitae, (2) transcripts (undergraduate and graduate), (3) a copy of the applicant's Ph.D. dissertation, (4) copies of published research papers, (5) two letters of recommendation, and (6) an indication of which field the applicant is applying for. The closing date is December 12, 1996.

CONTACT:
Professor Ya-Nan Wang (Chairwoman)
Department of Forestry, National Taiwan University
#1, Section 4, Roosevelt Rd., Taipei, Taiwan 106, R.O.C.
Phone: +886-2-3633352, Fax: +886-2-3654520
E-mail: m627@ccms.ntu.edu.tw
Hi everybody,

I have a problem and I was wondering if anybody could help me. I have three variables (a, b, c), and I know the variances and covariances between them (i.e. var(a), var(b), var(c), cov(a,b), cov(a,c), cov(b,c)). I need to create a new variable d = b + c, and by the theory of error propagation I can find var(d), but I also need cov(a,d) and I don't know how to find it. Any help or reference will be appreciated.

--
Jordi Riu Rusell                 tel.: 34-(9)-77-558187
Departament de Quimica           fax.: 34-(9)-77-559563
Universitat Rovira i Virgili     e-mail: rusell@quimica.urv.es
Pl. Imp. Tarraco, 1
43005-Tarragona
Catalonia - Spain
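Covariance is bilinear, so cov(a, d) = cov(a, b+c) = cov(a,b) + cov(a,c), using only quantities already in hand. A quick numerical check (the data values below are arbitrary, made up for illustration):

```python
import numpy as np

# Arbitrary made-up observations of a, b, c.
a = np.array([1.0, 2.0, 4.0, 7.0])
b = np.array([0.5, 1.5, 1.0, 3.0])
c = np.array([2.0, 0.0, 3.0, 1.0])
d = b + c

def cov(x, y):
    """Sample covariance (n-1 denominator)."""
    return np.cov(x, y)[0, 1]

print(cov(a, d))                  # equal (up to rounding) to the sum below
print(cov(a, b) + cov(a, c))
```

The same identity extends to any linear combination: cov(a, u*b + v*c) = u*cov(a,b) + v*cov(a,c).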
Hello all,

Once again I am seeking some knowledge and/or a reference to confirm a rumor. I am conducting an experiment with the following possible outcomes: {-11, -9, -7, -5, ..., 9, 11} (the odd numbers between -12 and 12). I will perform the experiment approximately 5000 times. The result of any one experiment does not affect the results of any later experiments. The goal is to find the expectation. I am rather certain the expectation is between -0.2 and +0.2, and the standard deviation is about 3.5.

I have always been under the impression that the best estimator of the expectation for this type of experiment is the average of a sample. With 5000 trials I would expect the average to be accurate to within 2 * 3.5 / sqrt(5000) <= 0.1, 95% of the time.

About a month ago I sat in on a job interview. During the interview the applicant mentioned that there was a better estimator available for this kind of experiment. I didn't believe him at the time, but then he mentioned a paper that he had written on the subject, so I took his word for it. I really wish I had written down the paper's title, because a better estimator would be very useful for me.

So the question is: "Is there a better estimator of the expectation than the average?" If so, a reference would be appreciated. As you might guess from my post, I don't have a background in statistics, but I can read mathematical papers, so a technical reference is fine.

Thanks for any help.

Hein Hundal
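The back-of-envelope bound above checks out numerically, and for i.i.d. data the sample mean is the natural nonparametric choice; alternatives such as trimmed means or shrinkage estimators can achieve lower mean squared error for particular distribution shapes, usually at the cost of bias. A quick check of the arithmetic plus a seeded simulation (the distribution below is a made-up example on the stated support, not the poster's actual experiment):

```python
import numpy as np

rng = np.random.default_rng(42)

# The poster's two-sigma bound: 2 * 3.5 / sqrt(5000).
bound = 2 * 3.5 / np.sqrt(5000)
print(bound)  # just under 0.1

# A made-up distribution on the odd numbers -11..11 with mean 0 and
# standard deviation near 3.5, used only to watch the sample mean behave.
support = np.arange(-11, 12, 2)
probs = np.exp(-(support / 5.0) ** 2)
probs /= probs.sum()

# Repeat the 5000-trial experiment 200 times and record how often the
# sample mean lands inside the bound (close to 95% if the sd is near 3.5).
means = [rng.choice(support, size=5000, p=probs).mean() for _ in range(200)]
print(np.mean(np.abs(np.array(means)) <= bound))
```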