Subject: Re: native speakers - is it "a REML" or "an REML approach"?
From: nichols@spss.com (David Nichols)
Date: 13 Nov 1996 15:43:59 GMT
In article <3288D90D.41AE@kodak.com>, Paige Miller wrote:
>David Ronis wrote:
>> Hans-Peter Piepho wrote:
>> >
>> > Native speakers of the English language among stat-l members:
>> >
>> > Is it
>> >
>> > a REML (procedure, approach or whatever)
>> > or
>> >
>> > an REML (procedure, approach or whatever) ?
>> >
>> I and several other native English speakers agree with the technical
>> editor that it should be an REML. Our reasoning is that the letter R as
>> pronounced in an abbreviation (rather than in a word) starts with a
>> vowel sound -- even though R is a consonant.
>
>But I have never heard it pronounced "are ee em el". The only pronunciation I
>have ever heard is "rem ul", which would be preceded by "a" and not "an".
>
>Perhaps a Bayesian instead of frequentist solution is required.
>
>--
>+---------------------------------+-----------------------------+
>| Paige Miller, Eastman Kodak Co. | "I hate a country witout a |
>| PaigeM@kodak.com | derrick" -- Mark Twain |
>+---------------------------------+-----------------------------+
>| The opinions expressed herein do not necessarily reflect the |
>| views of the Eastman Kodak Company. |
>+---------------------------------------------------------------+
That's funny, I hear it pronounced are/ee/em/el much more often than
remmel. I think the bottom line here is that there is no one correct
answer; it depends on your preferences.
--
-----------------------------------------------------------------------------
David Nichols Senior Support Statistician SPSS, Inc.
Phone: (312) 329-3684 Internet: nichols@spss.com Fax: (312) 329-3668
-----------------------------------------------------------------------------
Subject: local monitoring tool
From: k43@ix.urz.uni-heidelberg.de (Jan Schuhmacher)
Date: 13 Nov 1996 15:14:54 GMT
I intend to analyze user profiles with individual data rather than the
inappropriate aggregate data collected from server stats. For this purpose I'm
looking for a *local monitoring tool*, which can be installed on the PCs of
potential respondents. This tool is supposed to collect and save the
following data:
(1) type of application (Gopher, News, WWW, Mail...);
(2) duration of the connection;
(3) IP-address of the contacted server (or the URL);
(4) local time;
(5) graphic elements;
(6) volume
Especially if two- or multidirectional communication services are used
(Mail, IRC, News, MUDs...), the tool is supposed to automatically launch
routines that help gather information about the interacting partner(s)
and the type and purpose of interaction (full-text protocols are considered
to be problematic).
?1) Does anyone know of tools like these? (I'd be glad to get some URLs)
?2) Is anyone interested in developing a tool like this (maybe only for
WWW monitoring on the basis of the Mosaic history function), or does anyone
know of someone who is able and willing to do so?
Thanks in advance
..........................................................................
Jan W. Schuhmacher M.A. Bergheimer Strasse 35
E-Mail: jan.schuhmacher@urz.uni-heidelberg.de 69115 Heidelberg
Voice: +(049)6221 184729 Germany
..........................................................................
Subject: Re: transformation and heteroscedasticity
From: Rick McFarland
Date: Wed, 13 Nov 1996 18:25:53 GMT
Kiho Kim wrote:
>
> To anyone:
> In a multi-way anova, at what level does one need to test for
> heteroscedasticity? For instance, consider a two-factor anova (e.g.
> ETHNICITY and SEX) with two treatment groups (e.g. straight or curly hair)
> and 10 replicate measurements within each group. Should I be testing for
> heteroscedasticity within each of the replicate groups or only within
> the highest levels (i.e. Ethnicity and Sex)?
>
> Any insight would be helpful
> Thanx, Kiho
Kiho,
Your model is: Yij = mu + SEXi + ETHNICITYj + Eij
The assumption is Eij ~ NID(0, sigma^2). Therefore,
you would (theoretically) need to check that the residuals, eij,
across the ETHNICITY*SEX cells pass the test of homoskedasticity. If this
passes, it implies that there is homoskedasticity within
each of the individual replicate groups (think of the block breakup
of the covariance matrix). However,
in a practical sense, since you are probably using
Bartlett's test to check for homoskedasticity (which
does not depend upon mu), you can use the Yij's instead of
the eij's for this test. Be sure to check for normality of the residuals
FIRST, however, or else Bartlett's test may be failing from
lack of normality rather than heteroskedaciirthsklgfhjs (I hate
typing that word!).
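
[Editor's sketch] The remark that Bartlett's test "does not depend upon mu" can be checked numerically. The Python below is a minimal illustration, not anything posted in the thread: it computes the usual Bartlett statistic from within-cell sample variances, and the function name and the example data are made up for the purpose. Under equal variances and normality the statistic is compared with a chi-squared distribution on k - 1 degrees of freedom.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances across k groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    # Sample variances with the usual n - 1 divisor, one per cell.
    variances = [variance(g) for g in groups]
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s2) for n, s2 in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / correction


# Hypothetical cells; adding a different constant to each cell changes the
# cell means (the "mu" part) but leaves the within-cell variances alone.
cells = [[4.1, 5.3, 6.0, 4.8, 5.5], [7.2, 9.9, 5.1, 8.4, 6.6]]
shifted = [[x + c for x in g] for g, c in zip(cells, (10.0, -3.0))]
print(bartlett_statistic(cells), bartlett_statistic(shifted))
```

The two printed values agree up to floating-point rounding, which is the sense in which the raw Yij's within each cell can stand in for the residuals.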
----"AT LAST I HAVE CONTROL OF YOUR TV SET..."-------
/ R i c k M c F a r l a n d /
/ hrm3c@VIRGINIA.edu /
/ http://www.stat.virginia.edu/~hrm3c /
/ Department of Statistics - University of Virginia /
--------------------(804)924-3066------------------
Subject: Re: transformation and heteroscedasticity
From: Hans-Peter Piepho
Date: Wed, 13 Nov 1996 17:58:08 +0100
>To anyone:
>In a multi-way anova, at what level does one need to test for
>heteroscedasticity? For instance, consider a two-factor anova (e.g.
>ETHNICITY and SEX) with two treatment groups (e.g. straight or curly hair)
>and 10 replicate measurements within each group. Should I be testing for
>heteroscedasticity within each of the replicate groups or only within
>the highest levels (i.e. Ethnicity and Sex)?
>
>Any insight would be helpful
>Thanx, Kiho
>
>
If I get it right, you have a two by two factorial set-up, which can be
visualized by a two-by-two table with 10 replications within each cell. What
I think you should test for, is whether the variances within cells differ
among cells, i.e. you should test
H_0: Var_11 = Var_12 = Var_21 = Var_22
where Var_ij is the variance in cell ij (row i, col j) of the two-way table
below
                        Ethnicity
                straight hair   curly hair
Sex   male         Var_11         Var_12
      female       Var_21         Var_22
BTW: Do not use the Bartlett test. It is extremely sensitive to departures
from the normality assumption. A better test is the Levene test.
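
[Editor's sketch] For readers who want to see what the Levene test computes, here is a small Python illustration. It is an assumption-laden sketch rather than the poster's own procedure: it uses the median-centred variant (often called Brown-Forsythe), and the function name is invented. The statistic is a one-way ANOVA F ratio on the absolute deviations of each observation from its cell centre, referred to an F distribution with (k - 1, N - k) degrees of freedom.

```python
from statistics import mean, median


def levene_statistic(groups, center=median):
    """Levene-type statistic: one-way ANOVA F on absolute deviations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own cell's centre.
    z = [[abs(x - center(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((x - zm) ** 2 for zi, zm in zip(z, z_means) for x in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data for two cells; the second has three times the spread.
print(levene_statistic([[1, 2, 3, 4, 5], [0, 3, 6, 9, 12]]))
```

Passing center=mean gives the mean-centred form. For the two-by-two layout above, the four cells would be passed as four groups. If SciPy is available, scipy.stats.levene also returns a p-value.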
Hans-Peter
_______________________________________________________________________
Hans-Peter Piepho
Institut f. Nutzpflanzenkunde WWW: http://www.wiz.uni-kassel.de/fts/
Universitaet Kassel Mail: piepho@wiz.uni-kassel.de
Steinstrasse 19 Fax: +49 5542 98 1230
37213 Witzenhausen, Germany Phone: +49 5542 98 1248
Subject: Re : Exact Confidence Intervals
From: Norman Marsh
Date: Wed, 13 Nov 1996 18:19:24 +0000
> Date: Mon, 11 Nov 1996 16:22:39 +0000
> From: "Begoña Campos"
> Subject: exact confidence intervals
>
> Hi everybody, I need help with the following problem: the other day a
> colleague of mine asked me, "What about the exact confidence interval
> for the difference of proportions?" We looked up textbooks like Agresti
> (1990), Fleiss (1981), Hahn & Meeker (1991), and manuals like that for
> STATXACT and always found the same thing: the exact confidence interval for
> a proportion, and not even a word about the difference. We know we are
> dealing with the difference of binomials with equal n (1) and
> different p's, hence we cannot apply the addition property. What is
> the resulting distribution? Does it make any sense to explore this
> approach?
> Thanks in advance.
> Begona Campos
> Biostatistics Unit
> School of Medicine - University of Barcelona
>
Whereas earlier versions of StatXact may not do this, StatXact Version
3 for Windows has options for exact confidence intervals on the
difference of two binomials (two alternative statistics are considered).
These facilities are made available, reasonably, under a sub-menu of
'Fisher Exact Test'.
Before I send the enquirer to spend money on this excellent but expensive
software, I should add that I don't understand the reference to 'different
n (1)', so I fear slightly that I could have misunderstood the requirement.
So the enquirer should please feel free to contact me directly (via the
address below) if this advice seems to need further discussion.
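
[Editor's sketch] On the quoted question "What is the resulting distribution?": for fixed p1 and p2, the exact distribution of the difference of two independent binomial counts can be tabulated by direct enumeration. The Python below is only an illustration of that point, with invented function names. It is not the StatXact method and it does not build a confidence interval, since an exact interval also has to deal with the unknown p's.

```python
from math import comb


def binom_pmf(n, p):
    """Binomial(n, p) probabilities for x = 0..n."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def difference_pmf(n1, p1, n2, p2):
    """Exact PMF of D = X1 - X2 for independent binomial counts."""
    pmf1, pmf2 = binom_pmf(n1, p1), binom_pmf(n2, p2)
    out = {d: 0.0 for d in range(-n2, n1 + 1)}
    for x1, a in enumerate(pmf1):
        for x2, b in enumerate(pmf2):
            out[x1 - x2] += a * b  # independence: joint probability is a * b
    return out


# Hypothetical values: equal n, different p's, as in the question.
pmf = difference_pmf(10, 0.3, 10, 0.5)
print(sum(pmf.values()))                    # total probability, close to 1
print(sum(d * p for d, p in pmf.items()))   # mean, close to n*(p1 - p2) = -2
```

Dividing D by n gives the distribution of the difference of sample proportions when both samples have the same size.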
Norman Marsh
University of Liverpool
jw34@liverpool.ac.uk
Subject: Kolmogorov-Smirnov-type test for k-samples of unequal size
From: Susan Durham
Date: Wed, 13 Nov 1996 11:41:00 -0600
I am looking for a Kolmogorov-Smirnov-type test for k-samples of unequal
sizes. For the problem at hand, k is 3, 4 or 5. Sizes are _really_
unequal, and while some are large (100+), some are small (<10). I'd
appreciate any helpful pointers.
My thanks, in advance,
Susan
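
[Editor's sketch] One possible way to get a Kolmogorov-Smirnov-type test for k samples of very unequal size is a permutation test. The Python below is a hedged illustration, not a reply from the thread: the choice of statistic (the largest pairwise two-sample KS distance) and all function names are assumptions. Group labels are reshuffled while the group sizes are kept fixed, so unequal and small samples need no asymptotic approximation.

```python
import random
from bisect import bisect_right
from itertools import combinations


def ks_distance(a, b):
    """Two-sample KS distance: largest gap between the empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def max_pairwise_ks(samples):
    """Largest KS distance over all pairs of samples."""
    return max(ks_distance(a, b) for a, b in combinations(samples, 2))


def permutation_test(samples, n_perm=2000, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = random.Random(seed)
    observed = max_pairwise_ks(samples)
    pooled = [x for s in samples for x in s]
    sizes = [len(s) for s in samples]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for n in sizes:  # re-cut the pooled data into the original sizes
            shuffled.append(pooled[start:start + n])
            start += n
        if max_pairwise_ks(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Usage would be along the lines of permutation_test([big_sample, small_sample_1, small_sample_2]). An Anderson-Darling k-sample test, available as scipy.stats.anderson_ksamp, is another option in the same spirit.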
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Susan Durham
Statistical Consultant
Utah State University
Department of Fisheries and Wildlife
5210 University Blvd
Logan, UT 84322-5210
(801) 797-1337
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Subject: Re: Sudaan vs SAS, SPSS, etc
From: mserafin@chinook.halcyon.com (Mark A. Serafin)
Date: 13 Nov 1996 21:27:16 GMT
Rick Engberg (eng@med.pitt.edu) wrote:
: To its credit, Sudaan handles quite a few complex sample designs. It is
: capable of performing complex modeling procedures like linear and logistic
: regression. What seems like a big selling point to me is I've never heard of
: any other package for analysis of data obtained via non-SRS. Also, they were
: very nice about working with us on the price of the software since we had a
: limited budget and needed Sudaan to perform our analysis.
Actually, EpiInfo v.6 handles complex survey design, including
clustering, stratification and weighting. It doesn't do the kinds of
complex analyses that SUDAAN can do, but it works quite well for point
estimates and confidence intervals. It uses the Taylor Linearized
Deviation approach for calculating variances. It was developed by WHO and
the CDC for field workers doing epidemiology in developing nations.
Comparisons of results from Epi6 and SUDAAN agree within 1/10 of 1%.
Best of all, it's free (though the manual costs about $50). It
can be had from the CDC home page (www.cdc.gov) or from USD, Inc., 2075-A
W. Park Place, Stone Mt., GA 30087.
--
Mark Serafin | "Reality must take precedence over public
I speak only for myself | relations. Nature cannot be fooled."
| - Richard Feynman
Subject: Non-parametric decomposition of a time-varying histogram into a few components
From: "Thomas L. Bell"
Date: Wed, 13 Nov 1996 18:07:11 -0500
Hi,
I have data in the form of a time-dependent histogram n_i(t). For
example, the gray-levels of pixels in a digital black-and-white image
can be histogrammed as n_i, where i indexes the gray-level bins and n_i
is the number of pixels with gray-levels in bin i. A digital
black-and-white movie would produce a time-varying histogram of pixel
brightnesses.
I would like to describe the histograms as a sum of component
distributions with time varying coefficients; that is,
n_i(t) = m1(t) p1_i + m2(t) p2_i + m3(t) p3_i + ...
The component distributions p1, p2, ... are normalized and
non-negative. Ideally, only a few components can describe the data
satisfactorily. The interpretation would be that the images consist of
a mixture of a few types (dogs, cars, trees, ...). Each type has a
characteristic (noisy) distribution of gray-levels associated with it.
The coefficients m1(t), m2(t), ..., give the number of pixels in each
image involved in displaying type #1, type #2, ....
I don't want to specify a priori a set of parametric distributions p1,
p2, ... whose parameters are adjusted to fit the histograms as well as
possible. (For example, I don't want to assume the histograms are
mixtures of a few gaussian distributions with unknown means.) I would
like the component distributions p1, p2, ... themselves to be entirely
determined by the data.
This is somewhat like the principal component problem in multivariate
statistics, but these "principal components" are constrained to be
non-negative, they are not orthogonal, and they may not be unique.
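
[Editor's sketch] The decomposition described above has the shape of what is now usually called non-negative matrix factorization (NMF): the bins-by-time matrix of counts is approximated by a product of a non-negative "component distributions" matrix and a non-negative "coefficients" matrix. The Python below is a minimal sketch using multiplicative updates, a technique the post does not mention. It minimizes squared error rather than the chi-squared measure the poster has in mind, and all names and data are illustrative.

```python
import random


def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Sum of squared differences between v and the product w h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor v (bins x times) as w (bins x rank) times h (rank x times)."""
    rng = random.Random(seed)
    n_bins, n_times = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_bins)]
    h = [[rng.random() + 0.1 for _ in range(n_times)] for _ in range(rank)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[hij * n / (d + eps) for hij, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(h, num, den)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[wij * n / (d + eps) for wij, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(w, num, den)]
    # Rescale so each component distribution (column of w) sums to 1;
    # the scale moves into h, so the product w h is unchanged.
    col_sums = [sum(col) for col in zip(*w)]
    w = [[wij / s for wij, s in zip(row, col_sums)] for row in w]
    h = [[hij * s for hij in row] for row, s in zip(h, col_sums)]
    return w, h


# Hypothetical histogram built from two known components.
p1, p2 = [0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.3, 0.6]
m1, m2 = [100, 80, 60, 40, 20], [10, 30, 50, 70, 90]
v = [[a * p1[i] + b * p2[i] for a, b in zip(m1, m2)] for i in range(4)]
w, h = nmf(v, rank=2)
print(frobenius_error(v, w, h))
```

As the post anticipates, the components are neither orthogonal nor guaranteed unique, so different seeds can return different but equally good factorizations.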
Can anyone point me in the direction of literature on this type of
problem? I am willing to assume that the maximum likelihood solution
leads to minimization of the chi-squared measure of the difference
between a histogram and its fit. I know a simple-minded approximate
solution of the problem, but I feel certain there must be something in
the statistics or image-analysis literature on this topic. I was unable
to discover anything prowling around the library, though.
Any suggestions would be very much appreciated!
Thanks in advance.
Tom Bell
Subject: Re: SUDAAN & STATA for clustered data
From: William Gould
Date: Wed, 13 Nov 1996 17:39:28 -0600
John Rogers recently wrote that Stata does not handle
weights properly for survey data. Since I think that we are one of the few
packages that does handle survey data correctly (I am president of StataCorp),
I want to respond to that. John also complained about his experiences with
our technical support and, for that, I want to apologize. I don't know what
happened, but it is my job to ensure that John's experiences are not typical
and, mostly, I do a pretty good job.
> The weighting procedures in Stata require that non-integer weights
> sum to the sample size.
Stata has four types of weights:
1) aweights: these are automatically normalized to sum to the sample
size.
2) fweights: frequency weights
3) pweights: probability (sampling) weights; these weights are
unnormalized and give the right standard errors for survey
sampled data.
4) iweights: unnormalized weights
John's statement likely refers to aweights (in which case the weights are
automatically normalized to sum to sample size).
In earlier releases of Stata our basic regression command took aweights,
fweights, and iweights. In those releases we had an alternate command,
-hreg-, that handled pweights and I think that was what John was looking for.
-hreg- gives proper estimates for unstratified survey data.
In the current release of Stata, -hreg- has been merged back into the basic
Stata -regress- command and -regress- now takes aweights, fweights, iweights,
and pweights. We have also added a new suite of commands that are designed
especially for complex survey data.
These survey commands handle stratification and clustering and produce
variance estimates using the standard Taylor series linearization method
(as does SUDAAN). Our procedures give the same answers as those in SUDAAN.
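
[Editor's sketch] To make "Taylor series linearization" concrete, here is a small Python illustration for the simplest case, a weighted mean from a stratified, clustered sample. It is a sketch under stated assumptions and is not code from Stata, SUDAAN, or EpiInfo: PSUs are treated as sampled with replacement within strata, there is no finite population correction, and the function name and record layout are invented.

```python
from collections import defaultdict
from math import sqrt


def linearized_mean(records):
    """Weighted mean and linearized standard error.

    records: iterable of (stratum, psu, weight, y) tuples.
    """
    records = list(records)
    w_total = sum(w for _, _, w, _ in records)
    estimate = sum(w * y for _, _, w, y in records) / w_total
    # PSU totals of the linearized variable z = w * (y - estimate) / w_total.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - estimate) / w_total
    variance = 0.0
    for psus in psu_totals.values():
        n_h = len(psus)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean_h = sum(psus.values()) / n_h
        # Between-PSU variation within the stratum.
        variance += n_h / (n_h - 1) * sum((z - mean_h) ** 2 for z in psus.values())
    return estimate, sqrt(variance)


# Hypothetical data: one stratum, every observation its own PSU, equal weights.
data = [("s", i, 1.0, y) for i, y in enumerate([1, 2, 3, 4, 5])]
print(linearized_mean(data))
```

In this degenerate case the result reduces to the familiar s / sqrt(n). Multiplying every weight by the same constant changes neither the estimate nor the standard error, so for this estimator the scale of the sampling weights does not matter.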
> SPSS handles the weights properly.
SPSS is a fine and well respected package. On this score, however, as far as
I know, SPSS does not have the capability to produce design-based standard
errors for complex survey data, nor do they make any claims to providing
such features.
-- William Gould
wgould@stata.com
http://www.stata.com
Subject: Re: Signal-Noise Ratio?
From: Louis Lavallee
Date: Wed, 13 Nov 1996 21:31:35 GMT
In Article<55vpug$rrf@lastactionhero.rs.itd.umich.edu>, writes:
> From: bdecicco@sunm4048az.sph.umich.edu (Barry DeCicco)
> Newsgroups: sci.stat.consult
> Subject: Re: Signal-Noise Ratio?
> Date: 8 Nov 1996 17:17:04 GMT
> Organization: University of Michigan
>
> In article , Louis Lavallee writes:
> |>
> |>
> |> Barry,
> |>
> |> I have a somewhat different opinion of the use of signal to noise ratio in
> |> rank ordering data. If the objective is to simultaneously amplify the power
> |> of the signal (to change the output) while reducing the power of the noise (to
> |> affect the output), it is just fine. To do these separately usually creates
> |> conflict, since the minimum variance condition is usually the one leading to
> |> smallest output, e.g. turn off the signal. All this assumes you are measuring
> |> something related to the basic function you are trying to improve.
> |>
> |> Louis
> |>
>
>
> I would expect a conflict unless I was very lucky.
> The point is, to measure the trade-offs, so that you can
> make them. The 'classical Taguchi' method was to maximize
> the S-N ratio, and to assume that the 'optimal' setting
> of the control factors was that which did this.
>
> The alternate approach is to measure the effects of the control
> factors, on both the mean and the variance. Then, pick
> a control level combination which gives you what you want,
> within the limits that you have.
>
> Barry,
If I am not mistaken, the idea Taguchi had was to make
effective/efficient use of input power without creating lots of variation and
side effects, i.e., to find a combination of control factors to redistribute the
power in such a way as to amplify what was supposed to happen and dampen what
was not supposed to happen. Treat all engineering systems as transformations
of power from one form to another (or one location to another). After
redistribution of power, by changing nominal control factor setpoints,
tuning was done to hit whatever targets were desired, without upsetting the
relationship between sensitivity to signal and sensitivity to noise. I don't
believe there is any assumption about optimal settings. Verification tests
were usually conducted to check reproducibility of results under predicted
downstream conditions.
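
[Editor's sketch] For reference, the signal-to-noise ratios usually attributed to Taguchi can be written in a few lines of Python. The thread does not say which form either poster had in mind, so the three textbook variants below are an assumption, and the function names are invented. All are in decibels, and in every case a larger value is the better one.

```python
import math
from statistics import mean, variance


def sn_nominal_the_best(y):
    """10 * log10(mean^2 / variance): on-target response with low spread."""
    return 10 * math.log10(mean(y) ** 2 / variance(y))


def sn_smaller_the_better(y):
    """-10 * log10(mean of y^2): response should be as small as possible."""
    return -10 * math.log10(mean([v * v for v in y]))


def sn_larger_the_better(y):
    """-10 * log10(mean of 1 / y^2): response should be as large as possible."""
    return -10 * math.log10(mean([1 / (v * v) for v in y]))


# Hypothetical replicate measurements at one control-factor setting.
print(sn_nominal_the_best([9, 10, 11]))
```

The alternative described in the quoted text, modelling the mean and the variance separately, would keep mean(y) and variance(y) as two responses instead of collapsing them into one number.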
Subject: Employment: South Africa, Remote Sensing Researchers
From: chris@bayes.agric.za (Christopher Gordon)
Date: 14 Nov 1996 12:41:28 GMT
AGRICULTURAL RESEARCH COUNCIL of SOUTH AFRICA
INSTITUTE FOR SOIL CLIMATE AND WATER
REMOTE SENSING DIVISION
The following positions are now on offer at this Pretoria, South
Africa based Institute with its well equipped digital image
processing facility.
The successful candidates will form part of a team of 12
researchers and support staff specializing in Remote Sensing.
Three persons are required to research the development and
application of Remote Sensing Techniques for obtaining
Environmental and Agricultural Resource Information and
Statistics.
In addition to the educational requirements set for each
position, a relevant post graduate qualification and/or
experience in Remote Sensing/Digital Image Processing and GIS
will serve as a strong recommendation in each instance.
The specific requirements for each position are as follows:
Post 1 Rangeland Applications: A university degree in Ecology,
Botany, Rangelands Science or related fields.
Post 2 RADAR Applications: A university degree in Physics,
Applied Mathematics, Statistics, Engineering or a related field.
Post 3: A university degree in Natural, Earth or Pure Science or
related field (Soil Science, Geography, Botany, Geology,
Environmental Studies)
Applicants for all posts may be required to undertake
psychometric tests.
The ARC offers challenging opportunities in a pleasant work
environment as well as competitive remuneration packages,
including standard fringe benefits, which will be negotiated in
accordance with qualifications and experience.
Please forward your application together with CV to:
The Director:ISCW, P.Bag X79, Pretoria, 0001. (Fax --27 12 323
1157)
Applications close on 22 November 1996
Enquiries:
Dr JF Eloff / Mr TS Newby ph (--27 12) 326 4205
E-Mail : TERRY@IGKW2.AGRIC.ZA
Subject: Re: native speakers - is it "a REML" or "an REML approach"?
From: T.Moore@massey.ac.nz (Terry Moore)
Date: 14 Nov 1996 04:17:10 GMT
In article <3288CE7F.1CCA@kodak.com>, Paige Miller
wrote:
> According to my dictionary (Webster's Seventh New Collegiate Dictionary), the
> word REML does not qualify to have "an" in front of it. The only words that
> qualify are words that begin with a vowel, with a silent "h" (such as "honor"), or
> words that begin with an h that is part of an unstressed syllable (so, "an
> hypothesis" is correct, but "an homerun" is not).
That shows what happens when grammarians interfere with the
natural evolution of language.
The purpose of the "n" is to avoid having to make two separate
vowel sounds together. "An hypothesis" is hard to say unless
you drop the "h".
If you say "REML" as a word then "a REML" is fine.
If you say it as "arr-ee-em-el" what's wrong with "an REML"?
Just try saying "a arr-ee-em-el". How easy did you find it?
Terry Moore, Statistics Department, Massey University, New Zealand.
Imagine a person with a gift of ridicule [He might say] First that a
negative quantity has no logarithm; secondly that a negative quantity has
no square root; thirdly that the first non-existent is to the second as the
circumference of a circle is to the diameter. Augustus de Morgan
Subject: JOB - Eastern PA, Pharmaceutical
From: "Matthew C. Hutcheson"
Date: Thu, 14 Nov 1996 09:30:12 -0400
Statistician: ProMetrics Consulting, Inc. (www.prometrics.com) seeks two
highly motivated MS or PhD statisticians to join our team. Positions
involve analyzing large datasets, developing models, forecasting sales
and research in the healthcare and pharmaceutical industry. Expertise
in categorical and exploratory data analysis preferred. Send résumé or
CV to Matthew C. Hutcheson, Senior Analyst, 985 Old Eagle School Road,
Suite 512, Wayne, PA 19087. Fax: (610) 995-2285.