I'm having the statistical equivalent of writer's block. Can someone PLEASE remind me what the probability density function is for a simple sine wave? It's that U-shaped thingy -- you know what I mean. Thanks!

--- Dick
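(For completeness: the U-shaped density being recalled here is the arcsine law. If X = sin(theta) with theta uniform on (0, 2*pi), then f(x) = 1/(pi*sqrt(1 - x^2)) on (-1, 1). A quick Monte Carlo sanity check; the sample size and the test interval below are arbitrary choices, not from the thread.)

```python
import math
import random

def arcsine_pdf(x):
    """Density of X = sin(theta), theta ~ Uniform(0, 2*pi): 1/(pi*sqrt(1-x^2))."""
    return 1.0 / (math.pi * math.sqrt(1.0 - x * x))

# Compare the empirical fraction of samples in (-0.5, 0.5) against the
# exact probability from the CDF F(x) = 1/2 + asin(x)/pi.
random.seed(0)
n = 200_000
samples = [math.sin(random.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]
empirical = sum(-0.5 < s < 0.5 for s in samples) / n
exact = (math.asin(0.5) - math.asin(-0.5)) / math.pi  # = 1/3
print(round(empirical, 3), round(exact, 3))
```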
Most commercial Design of Experiments packages can deal with this problem. ECHIP handles this kind of constraint, for example.

-- Aaron J. Owens, Research Fellow
Modeling and Simulation, Engineering Research Laboratory
DuPont Company, Wilmington, DE 19880-0320
Internet: owens@prism.es.dupont.com
HWINC wrote:
>
> Dear Sci.Stat.Math Readers
>
> I need help to find the number of samples needed to measure
> the mean of a fluctuating quantity (velocity of turbulent fluid):
>
> given:
>
> accuracy = e = 0.001
> confidence: sigma of 99% = 2.33
> RMS value = 0.01
>
> number of samples = (2.33 x RMS / e)^2
>
> Is this correct? Your responses and comments are appreciated.
>
> thank you,
> Tony Falcone
>
> Please send reply to: falcon@cooper.edu

Tony,

You need one minor alteration to your formula, in order to account for the additional degrees of freedom necessary to reduce Type II errors below whatever risk tolerance threshold you have. The formula I think you want to use is as follows:

number of samples = (2.33 + U)^2 x (RMS / e)^2

where the added "U" factor is the standard normal deviate associated with the risk you are willing to take of rejecting a point as an "outlier" which is in fact within your 0.01 resolution requirement. You have only accounted for the Type-I risk of certifying a "bad" point as "good" (per your stated quality standards).

--- Dick
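Both formulas above are easy to evaluate numerically. A small sketch; the U = 1.645 value below is an assumed 5% Type-II risk chosen for illustration, not something stated in the thread:

```python
import math

def n_samples(z_alpha, rms, e, u=0.0):
    """Samples needed for the stated accuracy: n = ((z_alpha + u) * rms / e)^2,
    rounded up to the next whole sample."""
    return math.ceil(((z_alpha + u) * rms / e) ** 2)

# Tony's numbers: 99% confidence (z = 2.33), RMS = 0.01, accuracy e = 0.001.
print(n_samples(2.33, 0.01, 0.001))            # Type-I only: 543
# With Dick's extra term, using an assumed U = 1.645 (5% Type-II risk):
print(n_samples(2.33, 0.01, 0.001, u=1.645))   # 1581
```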
On 22 Nov 1996 aacbrown@aol.com wrote:

> Marks Nester writes:
>
> > The data tell you something whether or not you reject
> > the null hypothesis, e.g. the data may tell you that the
> > two groups are quite similar in their attitudes.
>
> If your null hypothesis is that two groups are identical, and you fail to
> reject it, there are two possibilities.

I agree. (Forgetting about validity of statistical assumptions and appropriateness of the test.)

> The groups may actually be identical, which is generally implausible,
> or you may not have enough data to distinguish them. Only the
> second explanation allows you to ask for more money.

Personally, if I owned the purse strings I wouldn't want to fund any project whose sole aim was to determine if two groups have EXACTLY identical attitudes.

********************************************************
All readers who control budgets take note and save their money!
********************************************************

I also do not know how one can justify asking for more money on the basis of a hypothesis test alone. As far as I can see, one would have to take a close look at the observed means, variances etc. Even though these may have been used in the original hypothesis test, suddenly one is considering point/interval estimates etc., and this, I believe, is what one should have done in the first place instead of bothering with the hypothesis test.

--------------------------------------------------------

All of the above is very interesting, but it has not escaped my attention that my original question has not been answered. I said:

It seems quite implausible that two different groups would share exactly the same attitudes. So why do a significance test? Who cares whether or not the two groups differ significantly?

You said inter alia:

It is often the case that the null hypothesis is implausible, but it is still a standard technique of classical inference.
I said inter alia:

Then perhaps classical inference is at fault. If the most plausible position is that the groups differ, then why not make that your null hypothesis? Ask a statistician to design for you a test which has this as the "null" hypothesis, and tell him/her that, on the basis of the test, you may be prepared to reject this new null hypothesis and concede that the groups really have IDENTICAL attitudes.

The "inter alias" above relate to the mechanics of hypothesis testing but do not explain why, in the vast majority of cases, no sensible person would accept the null hypothesis, yet the null hypothesis is still tested. Why not bypass the null hypothesis, proceed directly to point/interval estimates, then make your pronouncements: there are big differences between groups, or there are small differences between groups, or we can't really tell yet because of large variation, or whatever.

Kind regards
Marks R. Nester
If someone can help me with this problem, I would really appreciate it.

Y = ( 19x - 12 ) / (5x^2 - 15x)

I need you to solve for x. For example, if Y = 3x then x = Y/3.

Thank you again,
amukhtar@mail.bcpl.lib.md.us
Bill Simpson wrote:
> Several people are mentioning the use of robust statistical methods. This
> is essentially what I said.
>
> For example, using "robust regression" amounts to an assumption that the
> errors have a double exponential (Laplace) distribution. (Since robust
> regression minimizes the absolute errors, which is the way to get the MLE
> for a model with Laplace distributed errors.)

There are, in fact, many better robust regression methods than least absolute residuals. Some are maximum likelihood for some error model, but many are not.

-- Paul Velleman
In market research it is not uncommon to assume that some types of ordinal data can be treated as interval. It seems to me that the question is whether the groups (male, female) differ with respect to the response. There are two approaches I would recommend. One is a chi-squared test (if you are using a contingency table, you may need to decompose the chi-squared if the predictor (independent) variable has more than 2 levels). A second approach is the non-parametric Mann-Whitney test statistic (though there are some issues that arise when ranks have multiple ties and data sets are small). I am unfamiliar with SAS, but SPSS allows you to do all these tests under the various menu commands.

In <56ioti$frb@www.umsmed.edu> Warren writes:
>
>Nick,
>
>Unless you have reason to believe the categories are equally spaced, I
>would worry a little about numbering just as you said. And the general
>test of association may not tell you what you need to know... how do the
>proportions relate to each other on an ordinal scale?
>
>You have a couple of options, I would think.
>
>In PROC FREQ in SAS, you can select scores different from 1,2,3,4 (CMH),
>but you don't seem to have any strong belief in how much distance there
>is between categories.
>
>You could use ridits, which assign scores that take into account the
>number in each category... but I think we are back to numbering the
>categories if I remember the ridit procedure... it does some kind of
>midranking as I recall.
>
>You could use a straight ranking procedure...
>like Kruskal-Wallis-Mann-Whitney.
>
>Another technique for analyzing these data (your example fits the classic
>mold very well) is something called a "proportional odds model." It
>would be the best choice if the assumptions are met. SAS will do this
>type of model using PROC LOGISTIC and test the "PO" assumption.
>If the proportional odds assumption isn't reasonable, you can do a
>"generalized logits" model using PROC CATMOD... you could compare the
>"not at all" group to all the rest. Agresti would be a good place to look
>for references, but Agresti isn't a good "elementary" text... I haven't
>seen his new book, but it might be a little more "elementary". All of
>these would require a little reading on your part concerning the
>assumptions.
>
>n.w.nelson@education.leeds.ac.uk (nick nelson) wrote:
>>Say I have responses on an attitude scale,
>>
>>e.g. Do you like this? lots / some / a little / not at all
>>
>>and two groups, e.g. men and women. What is the best way to
>>establish whether the two groups differ significantly?
>>
>>One approach I have seen involves numbering the responses 4,3,2,1
>>and working with the means, but this seems dubious due to the
>>non-interval nature of the scale.
>>
>>Alternatively you could cast the responses in a 4x2 table and do
>>a chi2 on it, but this ignores the order information altogether.
>>
>>Is there a middle path?
>>
>>Nick.
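The rank-based comparison mentioned above (Mann-Whitney, with ties counted as half) can be sketched in a few lines. The attitude responses below are made up for illustration, and this sketch computes only the U statistic against its null expectation, not the tie-corrected p-value that the posters warn about for small samples:

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U: count of pairs (a, b) with a > b, ties counted as 1/2."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical ordinal responses coded 1..4 ("not at all" .. "lots"):
men = [4, 3, 3, 2, 4, 1]
women = [2, 1, 2, 3, 1, 2]
u = mann_whitney_u(men, women)
n1, n2 = len(men), len(women)
# Under the null of no group difference, E[U] = n1*n2/2.
print(u, n1 * n2 / 2)  # → 27.5 18.0
```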
Several people are mentioning the use of robust statistical methods. This is essentially what I said.

For example, using "robust regression" amounts to an assumption that the errors have a double exponential (Laplace) distribution. (Since robust regression minimizes the absolute errors, which is the way to get the MLE for a model with Laplace distributed errors.)

Maybe another distribution for the errors is more plausible.

Bill Simpson
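The Laplace/L1 connection Bill describes can be checked in the simplest case: for a pure location model, minimizing the sum of absolute errors yields the sample median, which is why least-absolute-residual fits resist outliers. A small sketch with invented data:

```python
def sum_abs_dev(data, m):
    """L1 loss: the objective minimized by least-absolute-residual fitting,
    and (up to a constant) the negative Laplace log-likelihood at location m."""
    return sum(abs(x - m) for x in data)

def median(data):
    s = sorted(data)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # one gross outlier
med = median(data)
mean = sum(data) / len(data)
# The median (3.0) beats the mean (22.0) under the L1 loss,
# illustrating the robustness to the outlier.
print(med, sum_abs_dev(data, med), sum_abs_dev(data, mean))  # → 3.0 101.0 156.0
```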
hi... does anyone have any idea how to compute the centroid of 5 points in a 4-dimensional space?

For 2 points, it would be given by the mid-point of the 2 points. For 3 points, it would be given by the centroid of the triangle with the 3 points as vertices. How about 5 points?

Thanks.
CK
C K Ang (engp6074@leonis.nus.sg) wrote:

> anyone has any idea how to compute the centroid of 5 points in a
> 4 dimensional space ?
> For 2 points, it would be given by the mid-point of the 2 points.

You mean the arithmetic mean of the two vectors.

> For 3 points, it would be given by the centroid of the triangle with
> the 3 points as vertex.

You mean the arithmetic mean of the three vectors?!

> How about 5 points ?

An exercise for the reader.... :-)

-- Dr Darren Wilkinson
On 22 Nov 1996, Randall C. Poe wrote:

> I'm reading a couple of statistics papers that have to do with nonparametric
> estimation in a setting with correlated measurement errors. I find the model
> used for the errors to be very strange, and I wonder if somebody could help
> me understand it intuitively.
>
> The errors e_i are taken to be samples from a stochastic process
> Z(t) with autocovariance function g(t). The i-th error is
> Z(a*x_i), where a is a scaling parameter. Large a corresponds
> to e_i widely spaced in the stochastic process (covariance
> g(a*(x_i - x_j))) and therefore not very highly correlated.
>
> Here's what I find peculiar: the parameter a also -> oo, at a
> rate either faster than n (a/n -> oo), slower than n (a/n -> 0),
> or the same rate as n (a/n -> constant). The different rates
> supposedly model different correlation regimes (termed
> asymptotically independent, long-range correlation, and
> short-range correlation).

The idea of a sequence of models with n and a increasing is not intended to be taken at face value as a description of an actual experiment. In any real experiment a and n are both particular finite numbers, not sequences. The authors are trying to answer the question "What are the properties of this model for particular finite a and n if a is not small compared to n?" As exact analysis of this problem is presumably infeasible, they simplify the problem by saying "Let n be very large and a be much smaller than / the same size as / much larger than n". In standard mathematical theory this requires the analysis of a sequence of models which have a and n increasing to infinity at the specified rates. The idea is that quantities proportional to, say, a/n can be ignored in the first case but not in the second or third case. Suppose your estimator has a bias which is roughly a/n.
The usual theory for ARMA models assumes that terms like this are small enough to be ignored, but if a and n are of comparable size in your data this may be untrue.

A simpler example to understand comes from linear regression. Suppose we have n observations on p variables. We all know that if n is large and p is small the least squares estimators are good, but that if p is nearly as big as n the estimators may be very poor. If you wanted to have some idea of how big p could be compared to n, you could study sequences of models in which, for example, p was proportional to n, or to the square root of n, or some such. Suppose you got consistent estimators when n was of order p^2. If you know from experience that in a certain setting with n=100 and p=10 you get good estimation, you would be able to say that you should get similarly good estimation with n=400 and p=20 in the same sort of data.

thomas lumley
UW biostatistics
Does anyone have the FORTRAN (or C) code for best-subsets regression by leaps and bounds (Furnival & Wilson)? I have only been able to trace one of the authors, who doesn't have email.

thomas lumley
UW biostatistics
Suppose X ~ N(mu, sigma^2). Then what is the formula for

E[X | X >= X^*] : the expected value of X given that X is greater than or equal to some fixed number X^*?

Could anyone give me a reference? Thanks in advance.

Tatsuo Ochiai
tochiai@students.wisc.edu
engp6074@leonis.nus.sg (C K Ang) in <576gt4$jho@nuscc.nus.sg>:

> anyone has any idea how to compute the centroid
> of 5 points in a 4 dimensional space ?

I probably have misunderstood your question. But I think the answer is just to take the average of the vectors position-by-position. That is, the first number in your centroid vector is the average of the first numbers of your five given vectors, and so on for each of the four positions.

Aaron C. Brown
New York, NY
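Aaron's position-by-position average is a one-liner in most languages. A sketch; the five 4-dimensional points below are made up for illustration:

```python
def centroid(points):
    """Coordinate-wise arithmetic mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

# Five made-up points in 4-dimensional space:
pts = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
print(centroid(pts))  # → [0.4, 0.4, 0.4, 0.4]
```

With two points this reduces to the midpoint, and with three to the centroid of the triangle, matching the cases in the original question.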
In <01bbd8e8$2b3ab2a0$1c8e13cf@abdulhamidMukhtar>:

> Y = ( 19x - 12 ) / (5x^2 - 15x)   I need you to solve for x...

When giving a problem like this, it helps to have some context. Otherwise it looks as if it might be a homework problem. I'm guessing it isn't, so: clearing the denominator gives the quadratic 5Yx^2 - (15Y + 19)x + 12 = 0, whence

x = [(15Y + 19) +/- (225Y^2 + 330Y + 361)^0.5] / (10Y)

Aaron C. Brown
New York, NY
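The two roots of the quadratic 5Yx^2 - (15Y + 19)x + 12 = 0 can be verified by substituting them back into the original equation. A sketch; the test value Y = 2 is an arbitrary choice:

```python
import math

def solve_for_x(y):
    """Roots of Y = (19x - 12)/(5x^2 - 15x), i.e. of 5Yx^2 - (15Y+19)x + 12 = 0.
    The discriminant 225y^2 + 330y + 361 is positive for all real y."""
    root = math.sqrt(225 * y * y + 330 * y + 361)
    return ((15 * y + 19 + root) / (10 * y),
            (15 * y + 19 - root) / (10 * y))

def f(x):
    return (19 * x - 12) / (5 * x * x - 15 * x)

y = 2.0
for x in solve_for_x(y):
    print(f(x))  # each value should numerically equal y
```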
Marks Nester writes:

> Why not bypass the null hypothesis, proceed directly
> to point/interval estimates

There are many statisticians who agree with you. In particular, many Bayesians lean toward this point of view. However, I think the solid majority of statisticians find classical hypothesis testing, which usually includes implausible null hypotheses, to be useful.

It's really more a question of the reporting of results than of statistical technique. I can say that "a 95% confidence interval for x is 1 to 2" or "I reject the null hypothesis that x=0 at the 5% significance level." The first statement gives more information; the second is often more useful for communicating results. But if you don't like hypothesis testing, don't do it.

Aaron C. Brown
New York, NY
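The duality Aaron mentions (a 95% interval of 1 to 2 versus rejecting x = 0 at the 5% level) can be made concrete: a two-sided z-test rejects exactly when the null value falls outside the confidence interval. A sketch with invented numbers chosen to reproduce his interval:

```python
def ci_and_test(xbar, se, null_value=0.0, z=1.96):
    """95% confidence interval and the equivalent two-sided z-test decision:
    reject the null iff null_value lies outside the interval."""
    lo, hi = xbar - z * se, xbar + z * se
    reject = not (lo <= null_value <= hi)
    return (lo, hi), reject

# Invented estimate and standard error giving roughly the interval (1, 2):
(lo, hi), reject = ci_and_test(xbar=1.5, se=0.255)
print(round(lo, 2), round(hi, 2), reject)  # → 1.0 2.0 True
```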
Hello,

The following distribution is called the Generalized Gaussian and is widely used in image processing:

\begin{equation}
f_{X}(x)=\left[\frac{\nu \eta (\nu ,\sigma )}{2\Gamma (1 / \nu)}\right] \exp (-[\eta (\nu ,\sigma ) |x|]^\nu ),
\end{equation}

where

\begin{equation}
\eta = \eta(\nu ,\sigma ) = {\sigma}^{-1} \left[\frac{\Gamma (3/\nu )}{\Gamma (1/\nu)}\right]^{1/2}.
\end{equation}

(sorry for this latex code)

I need to generate random numbers from this distribution and evaluate the first three moments of the distribution (conditioned on the variable being in some interval) on the computer. Could anybody help me with this? I will appreciate your advice and/or reference.

Thank you,

Igor Kozintsev, Graduate student
Electrical and Computer Engineering, Image Formation & Processing Group
2259 Beckman Institute, University of Illinois, Urbana, IL 61801
igor@ifp.uiuc.edu | http://www.ifp.uiuc.edu
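One standard way to sample this density (a sketch, not necessarily what image-processing codes use): if W ~ Gamma(shape 1/nu, scale 1), then |X| = W^(1/nu) / eta(nu, sigma) has the required magnitude distribution, and a random sign is attached. With the eta defined above, the variance works out to sigma^2, which gives a quick Monte Carlo check; nu = 1.5 and sigma = 2 below are arbitrary test values:

```python
import math
import random

def gg_eta(nu, sigma):
    """Scale factor eta(nu, sigma) from the second equation above."""
    return (1.0 / sigma) * math.sqrt(math.gamma(3.0 / nu) / math.gamma(1.0 / nu))

def gg_sample(nu, sigma):
    """Generalized Gaussian draw: (eta*|X|)^nu ~ Exp-tilted Gamma(1/nu, 1),
    so |X| = W**(1/nu) / eta with W ~ Gamma(1/nu, 1), sign chosen at random."""
    w = random.gammavariate(1.0 / nu, 1.0)
    x = w ** (1.0 / nu) / gg_eta(nu, sigma)
    return x if random.random() < 0.5 else -x

random.seed(1)
nu, sigma = 1.5, 2.0
xs = [gg_sample(nu, sigma) for _ in range(100_000)]
mean = sum(xs) / len(xs)
var = sum(x * x for x in xs) / len(xs)
print(round(mean, 2), round(var, 2))  # mean near 0, variance near sigma^2 = 4
```

The conditional moments on an interval [a, b] could then be estimated by averaging x, x^2, x^3 over the samples that land in [a, b].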
Bob Massey wrote:

> [...] real computers can only compute a finite number
> of cycling (rational) outputs even if run forever!

Yeah. But such does not constitute a limit upon that which is computable (via machine, or otherwise).

A machine that is designed so that it can "divide & conquer" can render such infinities irrelevant. Basically, such a machine transforms all problems into Geometry, and, instead of "algorithms", uses cross-correlation among continua to arrive at solutions. In such an approach, an infinity can be represented on a continuum... the "end points" are superfluous because the continuum represents "all there is of a particular 'quality'". Then calculation can "orient" within "all there is" within the "infinity" in question by simply "sliding" along the continuum. "Infinities" are easily dealt with in this way. Real solutions with respect to such are possible in short computational periods.

There is one "caveat", however. "Solutions" that are produced are only as accurate as are the correlations between the continua and the "infinities" that they represent. This caveat is dealt with by imbuing such a computational system with "intelligence" with respect to such correlations. That is, within every problem-solving activity that occurs within such a system, there exists a parallel continuum-correlation-with-infinity convergence dynamic. And when one looks, one finds that such constitutes the essence of that which is referred to as "intelligence". What's actually happening is that the system strives continuously to climb the gradient that is what's described by 2nd Thermo (WDB2T).

Given a means to assess the continuum-infinity correlations, accuracy can be increased to any desired degree, and reasonable accuracy can be converged upon within relatively short time frames. And it's easy to imbue such systems with the capacity to "recognize" advantageous "tool" inventions, and to produce outputs that will tell of such tools.
That is, to imbue the system with the ability to invent the tools it needs to increase the correlations... the machine would recognize the usefulness of lenses, for instance, because encounters with lenses would result in the useful increase of correlations... this'd lead to the invention of magnifying glasses, microscopes and telescopes. And so forth. Our nervous systems work in precisely this fashion.

I've explored computer math using such a Geometrical approach, and know of no problems that it doesn't handle in straight-forward fashions. Such systems incorporate no "tree"-like structures. They "just" do their continua-"sliding" cross-correlation stuff. Coincidences among continua in a converged "state" form the closest thing that this sort of system has to a "tree"-like structure. But such is not very "tree"-like, because new continua can be added continually, and each new continuum makes possible formerly non-existent "tree"-branching analogues. This means that the "tree"-like structure is capable of growing everywhere. This renders the conceptualization of "branching" nonsensical; hence "trees", and their finite "time" & "space" (physical memory), are rendered nonsensical, too. (Note, of course, some physical memory is required, and the "resolving power" of a "continuum machine" will increase in some proportion to physical memory. But even small "continuum machines" can solve seemingly-"intractable" problems.)

[I apologize if this msg seems a bit "cryptic". I'm exhausted, and close to accepting that no one will ever be able to come and play with me. So why bother to "translate"? What's here is here because I'm obliged to make it available.]

ken collins
Tatsuo asked:

> Suppose X ~ N(mu, sigma^2). Then, what is the formula for
> E[X | X >= X^*] : Expected value of X given X is greater than or
> equal to some fixed number X^*

You can find the expected value by evaluating the integral of x*p(x) from -mu/sigma^2 to oo and multiplying the result by sigma^2. p(x) is the pdf of N(0,1) and is equal to exp(-x*x/2)*(2*pi)^-0.5.

BTW, the indefinite integral of x*p(x) in this case evaluates to -p(x).

The book that I found most thorough in explaining expected values and introductory statistics in general was "Statistical Theory and Methodology in Science and Engineering" by K.A. Brownlee, 1960. In my opinion this book beats any modern intro statistics book hands-down. Page 38 is on expected values.

I had asked a similar question in this newsgroup and it took me a good week to justify the answer I got, but I finally did. Thanks again to Aaron Brown who gave me the initial help.

Dimitri
Ilias Kastanas wrote:
>
> In article <329393FD.10CE@orci.com>, Bob Massey wrote:
> >kenneth paul collins wrote:
> >>
> >> ..... . All such attempts can be disproven by presenting the
> >> system with something that "breaks" the syntax. (This is also my main
> >> objection to Goedel's "Incompleteness".)
>
> Side note: I didn't follow this exchange, but "breaking the syntax"
> is irrelevant to Goedel Incompleteness.

By "syntax" I was referring to the "rules" of the "proof". I stand on what I posted.

ken collins

[snipped falsely-attributed stuff and discussion directed to others]
kenneth paul collins (KPCollins@postoffice.worldnet.att.net) wrote:

: A machine that is designed so that it can "divide & conquer"
: can render such infinities irrelevant. Basically, such a
: machine transforms all problems into Geometry, and, instead
: of "algorithms", use cross-correlation among continuua to
: arrive at solutions.

[tons snipped]

Sounds very fascinating, ken. Can you point me to a good reference on this stuff? Perhaps a journal paper...

--jono

--
"The human mind is a dangerous plaything, boys. When it's used for evil, watch out! But when it's used for good, then things are much nicer." -- The Tick
I made a mistake in my previous post. When I typed sigma^2 I meant sigma (the standard deviation). So the corrected post should read like this:

>You can find the expected value by evaluating the integral of x*p(x) from
>-mu/sigma to oo and multiplying the result by sigma.
>
>p(x) is the pdf of N(0,1) and is equal to exp(-x*x/2)*(2*pi)^-0.5
>
>BTW, the indefinite integral of x*p(x) in this case evaluates to -p(x)
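For the record, the closed form the original question asks for is the inverse Mills ratio result: E[X | X >= x*] = mu + sigma * phi(a) / (1 - Phi(a)), where a = (x* - mu)/sigma and phi, Phi are the standard normal pdf and cdf. A numerical sanity check (a sketch; the step count, integration cutoff, and test values mu = 1, sigma = 2, x* = 0.5 are arbitrary choices):

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def trunc_mean(mu, sigma, xstar):
    """E[X | X >= xstar] for X ~ N(mu, sigma^2): mu + sigma*phi(a)/(1 - Phi(a))."""
    a = (xstar - mu) / sigma
    return mu + sigma * phi(a) / (1.0 - Phi(a))

# Cross-check by midpoint-rule integration of x*f(x)/P(X >= xstar)
# over [xstar, mu + 10*sigma] (the tail beyond that is negligible):
mu, sigma, xstar = 1.0, 2.0, 0.5
steps, hi = 200_000, mu + 10.0 * sigma
h = (hi - xstar) / steps
num = den = 0.0
for i in range(steps):
    x = xstar + (i + 0.5) * h
    fx = phi((x - mu) / sigma) / sigma
    num += x * fx * h
    den += fx * h
print(num / den, trunc_mean(mu, sigma, xstar))  # the two should agree closely
```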