Subject: Re: how to combine categorical var. with continuous var.?
From: Simo Virtanen
Date: Mon, 11 Nov 1996 10:04:58 -0800
Haiyi Xie wrote:
> Sorry for cross posting.
> I need advice on a simple practical question: Based on theoretical reasoning, I
> need to combine 3 variables and form a sort of "scale" or composite variable,
> but the 3 variables have different scales (dichotomous, ordinal and interval,
> respectively). How do you combine those variables into one?
One good way that I was taught to use is to transform all variables
so that they have a minimum of 0 and maximum of 1, then calculate
the mean. This may not be perfect because equal weights are used but
the composite is easy to interpret since it has the same scale as
the individual items.
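As a minimal sketch of this rescale-then-average idea (the data and the function names `min_max_scale` and `composite` are made up for illustration, and the observed minimum and maximum are used where one might prefer the theoretical endpoints of each scale):

```python
def min_max_scale(values):
    """Rescale numbers linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [(v - lo) / (hi - lo) for v in values]


def composite(*variables):
    """Equal-weight mean of the rescaled variables, one score per case."""
    scaled = [min_max_scale(v) for v in variables]
    return [sum(case) / len(case) for case in zip(*scaled)]


# Hypothetical data for four cases
dichotomy = [0, 1, 1, 0]               # yes/no item
ordinal = [1, 3, 5, 2]                 # 1-5 rating
interval = [20.0, 35.0, 50.0, 20.0]    # some measured quantity

scores = composite(dichotomy, ordinal, interval)
print(scores)
```

Every score falls between 0 and 1, and a case that is lowest (or highest) on all three items gets exactly 0 (or 1).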
--
Simo V. Virtanen, Ph.D.
Finnish Institute of Occupational Health
Helsinki, FINLAND
Subject: summer and academic year visits
From: Mark Schervish
Date: Mon, 11 Nov 1996 08:43:43 -0500
***********************************************************************
* CARNEGIE MELLON ANNOUNCES NEW PROGRAM TO SUPPORT VISITS OF RECENT *
* PH.D. RECIPIENTS DURING EITHER SUMMER OR ACADEMIC YEAR *
***********************************************************************
The Department of Statistics at Carnegie Mellon University, with
partial support from the National Science Foundation, has established
an Institute for Statistics and its Applications.
The purpose of the Institute is to foster the development of
statistical methodology through vigorous cross-disciplinary
collaborations, and to train pre-doctoral and post-doctoral
statisticians in cross-disciplinary research and teaching. All
members of the Department are affiliated with the Institute. Recently
the Department appointed four post-doctoral fellows in the Institute.
Currently the research collaborations in the Institute include the
following subject matter areas: cognitive psychology; functional
magnetic resonance imaging; genetics; psychiatric statistics;
statistical physics; criminology; governmental statistics and public
policy; environmental statistics; and finance.
The Institute seeks applications for visiting positions from
statisticians with recent Ph.D. degrees who are interested in
cross-disciplinary research and teaching. Research support is also
available to selected recent Ph.D. recipients who wish to visit the
Institute during the summer. The length of all appointments is
negotiable. Please send a vita, relevant transcripts, research papers,
and three letters of recommendation to:
Chair, Faculty Search Committee
Department of Statistics
Carnegie Mellon University
Pittsburgh, PA 15213
Women and minorities are especially encouraged to apply. Carnegie Mellon
University is an affirmative action/equal opportunity employer.
Subject: sample size for nonparametric test
From: muenchp@xanth.CS.ORST.EDU (Pornsiri Muenchaisri)
Date: 11 Nov 1996 19:14:24 GMT
Hi,
I would like to know how small a sample size can be
for a two-sample nonparametric test, e.g., the Mann-Whitney-Wilcoxon test.
I will conduct two case studies on comparing
two software design/development methodologies.
One is my new method and the other is a conventional one.
My dependent variables are number of errors for the first case study and
development time for the second one.
My subjects are volunteer students, who may be hard to find because
1. the case study requires students to learn both methods,
2. it takes time to do problem solving tasks, and
3. I plan to perform retrospective analysis (protocol analysis) also.
It may take 3+ hours for each student for the whole session.
Jorgensen [1] suggests three subjects for protocol analysis, but
I don't know whether that can be applied to comparing two samples.
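One way to put a floor under the question is to compute the smallest p-value an exact two-sample rank test can produce. The sketch below assumes no ties and a two-sided test; the function name `min_two_sided_p` is invented here. It says nothing about power, only about which significance levels are attainable at all.

```python
from math import comb


def min_two_sided_p(n1, n2):
    """Smallest two-sided p-value of an exact Mann-Whitney-Wilcoxon test.

    Under the null hypothesis (and with no ties) each of the C(n1+n2, n1)
    ways of assigning ranks to the first group is equally likely, so the
    most extreme arrangement has one-sided probability 1 / C(n1+n2, n1).
    """
    return min(1.0, 2 / comb(n1 + n2, n1))


for n in (3, 4, 5):
    print(n, n, round(min_two_sided_p(n, n), 4))
```

With three subjects per group the floor is 2/20 = 0.1, so a two-sided test at the 0.05 level can never reject; four per group is the first balanced design where it can.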
I would appreciate your suggestions and pointers.
Thank you.
Pornsiri
[1] Anker Helms Jorgensen, Using the Thinking-Aloud Method in
System Development, in Designing and Using Human-Computer Interfaces
and Knowledge Based Systems, edited by G. Salvendy and M.J. Smith,
Elsevier Science Publishers B.V., Amsterdam, 1989.
Subject: SAS / STAT / Excellent Opportunities in PA/NJ/DE
From: Loren King
Date: Mon, 11 Nov 1996 14:24:53 -0800
Enjoy the flexibility of the consultant lifestyle
while benefiting from the stability of working with
an industry leader. Our clients consist primarily of
Fortune 500 companies offering excellent compensation
for Philadelphia area SAS opportunities.
We invite you to check out our web site at:
http://www.edpcs.com
Put Us To Work For You!
Code: 2183/SAS/1111
Duration: 4 Months
Position: Senior Programmer / Analyst
Duties: Work on Phase IV and V Clinical Trials
Requires: SAS, Clinical Trials and VAX experience
Location: Northwest Philadelphia Suburbs
Salary: $30-40 per hour / commensurate with experience
Note: Position will become permanent after 90 days
-----------------------------------------------------------------
Code: 2174/SAS/1111
Duration: 3 Months
Position: Programmer / Analyst
Duties: Work on pedigree engineering certification
Requires: SAS and SAS/Graph experience
Location: Northwest Philadelphia Suburbs
Salary: $30-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2172/SAS/1111
Duration: 12 Months
Position: Senior Programmer / Analyst
Requires: SAS
Location: Northwest Philadelphia Suburbs
Salary: $35-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2133/SAS/1111
Duration: 7 Months
Position: Senior Programmer / Analyst
Duties: Update, develop and make enhancements to
current Oracle system. Develop system to
interface with current arrow diagnostic system,
auto encoding system and dictionary system
Requires: SAS, Oracle and PL/SQL experience
Location: Central New Jersey
Salary: $35-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2130/SAS/1111
Duration: 6 Months
Position: SAS Programmer
Duties: Develop tools for user-friendly access
to large diverse databases currently
stored on IBM mainframe used for
epidemiological and outcomes research.
Requires: SAS, SAS/AF and IBM Mainframe
Desired: Systems background experience
Location: Northwest Philadelphia Suburbs
Salary: $35-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2115/SAS/1111
Duration: 6 Months
Position: SAS Programmer
Duties: Develop Phase II & III Clinical
Trials reporting applications
Requires: SAS, VMS and Clinical Trials
Desired: Systems background experience
Location: Western Philadelphia Suburbs
Salary: $35-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2105/SAS/1111
Duration: 12 Months
Position: Senior Programmer / Analyst
Duties: Perform approaches to measuring Quality of Life
Requires: Experience in Pharmacoeconomics as well
as SAS Stat, SAS Graph and Clinical Trials
Location: Central New Jersey
Salary: $35-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2080/SAS/1111
Duration: 3 Months
Position: Programmer / Analyst
Duties: Perform validation and enhancements
to existing clinical reports
Requires: Strong SAS skills
Desired: VMS experience
Location: Northwest Philadelphia suburbs
Salary: $35-45 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2072/SAS/1111
Duration: 3 Months
Position: Programmer / Analyst
Duties: Perform enhancements to clinical reporting
applications using SAS Base and SAS
Macro on a VAX/VMS platform
Requires: SAS and Clinical Trials experience
Location: Northwest Philadelphia suburbs
Salary: $25-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2046/SAS/1111
Duration: 2 Months
Position: Technical Consultant
Duties: PH CLINICAL software installation and support
Requires: VMS and Clinical Trials experience.
Location: Western Philadelphia Suburbs
Salary: $30-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 2001/SAS/1111
Duration: 12 Months
Position: Programmer / Analyst
Requires: Strong BASE SAS background and a
working knowledge of MS Office
Desired: SAS Macro and AF are desired
Location: Northwest Philadelphia Suburbs
Salary: $25-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1931/SAS/1111
Duration: 12 Months
Position: Analyst
Duties: Support the clinical reporting
teams for drug study applications
Requires: SAS, Oracle and VMS experience
Location: Northwest Philadelphia Suburbs
Salary: $25-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1900/SAS/1111
Duration: 6 Months
Position: Programmer / Analyst
Duties: Work in a clinical trials environment
Requires: SAS, IBM Mainframe and JCL experience
Location: Northwest Philadelphia Suburbs
Salary: $25-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1881/SAS/1111
Duration: 6 Months
Position: Statistician
Duties: Participate in the development of a Phase
Three cardiovascular clinical application
Requires: Pharmaceutical experience along with a
strong statistical analysis background
Location: Western Philadelphia Suburbs
Salary: $35-45 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1824/SAS/1111
Duration: 9 Months
Position: Programmer / Analyst
Duties: Programming for Phase III Clinical Trials
Requires: Phase II & III Clinical Trials safety &
efficacy data on VAX/VMS
Desired: SAS/BASE & Macros desired
Location: Central New Jersey
Salary: $30-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1823/SAS/1111
Duration: 6 Months
Position: Senior Programmer / Analyst
Duties: Analysis, coding & data validation
support for sales & marketing applications
Requires: SAS, SQL, Oracle, Unix
Location: Central New Jersey
Salary: $30-40 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1779/SAS/1111
Duration: 6 Months
Position: Programmer / Analyst
Duties: Develop clinical applications under VAX/VMS
Requires: SAS, VAX/VMS and clinical trials experience
Location: Northwest Philadelphia suburbs
Salary: $30-40 per hour / commensurate with experience
Note: Excellent chance for performance based extension.
-----------------------------------------------------------------
Code: 1776/SAS/1111
Duration: 6 Months
Position: Statistician
Duties: Phase 2 and 3 clinical trials applications
Requires: SAS experience
Location: Northwest Philadelphia suburbs
Salary: $35-45 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1706/SAS/1111
Duration: 6 Months
Position: Senior Programmer / Analyst
Duties: Participate in the development
of Clinical Trials applications
Requires: SAS, VMS and Clinical Trials experience
Location: Northwest Philadelphia suburbs
Salary: $30-40 per hour / commensurate with experience
Note: Excellent chance for performance based extension.
-----------------------------------------------------------------
Code: 1635/SAS/1111
Duration: 6 Months
Position: Programmer / Analyst
Duties: Load data from ASCII, Btrieve or
MS Access files into SAS files
Requires: SAS BASE & MACRO experience
Location: Wilmington, Delaware
Salary: $25-35 per hour / commensurate with experience
-----------------------------------------------------------------
Code: 1459/SAS/1111
Duration: 6 Months
Position: Senior Programmer / Analyst
Duties: Load data from ASCII, Btrieve or
MS Access files into SAS files
Requires: Minimum of 3-4 years of SAS, clinical,
pharmaceutical or related experience
Location: Central New Jersey
Salary: $30-40 per hour / commensurate with experience
-----------------------------------------------------------------
The Southeast Pennsylvania / Tri-State area has the
highest demand for SAS professionals in the country,
and EDP is the leading provider for professional DP
contractors in the area. We are seeking talented
individuals with paid/professional experience, residing
in U.S., for additional SAS opportunities. U.S.
Citizenship, Green Card, F-1 or TN eligibility required.
Please forward your resume to:
Loren King, Technical Recruiting
EDP Contract Services
401 City Ave., Suite 915
Bala Cynwyd, PA 19004
610 667-8735 (fax)
610 667-2990 (voice)
e-mail: balacynwyd@edpcs.com
Subject: Re: [Q] Using pseudoinverse in Bayes discriminant function?
From: Greg Heath
Date: Mon, 11 Nov 1996 14:53:56 -0500
On Fri, 8 Nov 1996, Dukki Chung wrote:
> Hi. Recently, I had to use Bayes classifier for a pattern classification
> problem. The Bayes discriminant function is:
> di(x) = - [ ln|Ci| + (x-mu)^t Ci^-1 (x-mu)]
di(x) = - [ ln|Ci| + (x-mui)^t Ci^+ (x-mui) -2 lnPi]
> The problem was, the covariance matrix Ci was near singular, so the
> inverse could not be calculated. So, I used pseudoinverse instead of real
> inverse. What I'm wondering is whether this is a valid, justifiable
> mathematical or statistical approach.
Yes. I've always used the pseudoinverse. The ill-conditioning of the
covariance matrix results in near zero eigenvalues corresponding to
directions in space for which the distribution has nearly a constant
value (i.e., nearly a zero variance).
> I would appreciate any comments, suggestions, references, or
> pointers.
Check the eigendirections associated with the near-zero eigenvalues.
Classes with near constant values in those directions might be able to be
classified quite easily based on that fact alone.
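A rough sketch of the discriminant with a pseudoinverse, using NumPy. Two things here are assumptions rather than part of the formula above: eigenvalues below a relative tolerance `tol` are treated as zero, and ln|Ci| is replaced by the log of the product of the remaining eigenvalues, since the log of a zero determinant is undefined. The helper `off_subspace_distance` is a made-up name for the eigendirection check.

```python
import numpy as np


def discriminant(x, mean, cov, prior, tol=1e-10):
    """d_i(x) = -[ln|C_i| + (x - mu_i)^T C_i^+ (x - mu_i) - 2 ln P_i].

    Assumption: eigenvalues at or below tol * (largest eigenvalue) count as
    zero, and ln|C_i| uses only the retained eigenvalues.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)
    keep = eigvals > tol * eigvals.max()
    # Coordinates of the centered point along the retained eigendirections.
    z = eigvecs[:, keep].T @ (np.asarray(x, dtype=float) - mean)
    quad = np.sum(z**2 / eigvals[keep])        # (x - mu)^T C^+ (x - mu)
    log_pdet = np.sum(np.log(eigvals[keep]))
    return -(log_pdet + quad - 2.0 * np.log(prior))


def off_subspace_distance(x, mean, cov, tol=1e-10):
    """Length of (x - mean) along the near-zero-variance eigendirections."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    null = eigvecs[:, eigvals <= tol * eigvals.max()]
    return float(np.linalg.norm(null.T @ (np.asarray(x, dtype=float) - mean)))


# Hypothetical class whose points all lie on the line x1 = x2.
mean = np.zeros(2)
cov = np.array([[1.0, 1.0], [1.0, 1.0]])

on_line = [1.0, 1.0]
off_line = [1.0, -1.0]
print(discriminant(on_line, mean, cov, 0.5), off_subspace_distance(on_line, mean, cov))
print(discriminant(off_line, mean, cov, 0.5), off_subspace_distance(off_line, mean, cov))
```

In this toy case the pseudoinverse ignores any deviation along the zero-variance direction, so the off-line point scores higher than the on-line one even though the class never leaves the line. That is one reason to look at those directions separately.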
Hope this helps.
Gregory E. Heath heath@ll.mit.edu The views expressed here are
M.I.T. Lincoln Lab (617) 981-2815 not necessarily shared by
Lexington, MA (617) 981-0908(FAX) M.I.T./LL or its sponsors
02173-9185, USA
Subject: Re: Sudaan vs SAS, SPSS, etc
From: "Thomas R. TenHave"
Date: Mon, 11 Nov 1996 09:06:07 -0500
>Date: Sat, 9 Nov 1996 23:18:06 -0500
>From: Ellen Hertz
>Subject: Re: Sudaan vs SAS, SPSS, etc
>John Roden wrote:
>
>> I understand that for datasets like NCES surveys which use a multistage
>> sampling design, the cases are multiplied by a weight so they
>> approximate the original population. I think the deal w/ Sudaan is that it is
>> able to account for this weighting when figuring the degrees of freedom,
>> otherwise you can use the square root of the weight in SPSS I think.
>>
>> I'm curious on this issue also. I know Sudaan is much less user friendly
>> than SPSS. I don't think it is full featured either, is it?
>>
>> Daniel Nordlund wrote:
>> >
>> > I recently came across some information suggesting that
>> > stat packages like SAS and SPSS don't calculate std. errors
>> > for analyses of data obtained through cluster sampling. I
>> > was told that I would need to use Sudaan.
>> > It was also suggested that Sudaan was likely more
>> > appropriate for a variety of situations where one is dealing
>> > with correlated data, e.g. repeated measures or matched
>> > data. I'm not sure I believe this.
>> > Can anyone shed light on this subject for me?
>> >
>> > Dan
> SAS assumes that the observations are independent and
>that a weight in a WEIGHT statement is a count. If the data
>are clustered and/or the weights are survey weights, and the
>software does not "know" that fact, the point estimates often
>will be accurate but the standard errors will be much smaller than
>they really are.
If the sampling fraction is small and there is no clustering
in the sample design, then the standard errors produced
by SAS with the correct weights (i.e., the weights sum
to the sample size) are accurate.
In fact under these conditions, the SAS standard errors
are more efficient than those standard errors
produced by SUDAAN, which uses the sandwich estimator
regardless of whether the sampling design involves clusters.
The sandwich estimator used by SUDAAN is similar to
that used typically for GEE but without weights.
Of course, when there is clustering the SAS standard
errors are biased, and those produced by SUDAAN
are asymptotically unbiased.
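To make the contrast concrete, here is a small unweighted sketch for the simplest estimator, a sample mean. It is not SUDAAN's or SAS's actual code; the data are invented, and the g/(g-1) factor is one common small-sample correction that I am assuming for illustration.

```python
from math import sqrt


def naive_se(values):
    """Standard error of the mean assuming independent observations."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return sqrt(var / n)


def cluster_robust_se(values, clusters):
    """Sandwich-type standard error of the mean.

    Residuals are summed within each cluster before squaring, so any
    within-cluster correlation carries through to the variance estimate.
    """
    n = len(values)
    mean = sum(values) / n
    totals = {}
    for v, c in zip(values, clusters):
        totals[c] = totals.get(c, 0.0) + (v - mean)
    g = len(totals)
    # g / (g - 1): assumed small-sample correction.
    return sqrt(g / (g - 1) * sum(t**2 for t in totals.values())) / n


# Hypothetical data: 4 clusters of 3 perfectly correlated observations.
values = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
clusters = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

print(naive_se(values), cluster_robust_se(values, clusters))
```

Here the cluster-robust figure is close to double the naive one, because the 12 observations carry only 4 clusters' worth of information. If every observation is given its own cluster, the two functions return the same number for a mean.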
Tom TenHave
Penn State University
Subject: Jacobian/model identifiability
From: John Uebersax <71302.2362@COMPUSERVE.COM>
Date: Mon, 11 Nov 1996 09:44:50 EST
Assume a crossclassification table, a suitable model, and
MLEs for model parameters. Consider the Jacobian matrix
formed with rows corresponding to model parameters, and
columns corresponding to each possible cell of the table.
A standard result says that, for the model to be (locally) identifiable,
the rank of the Jacobian must equal the number of rows, i.e., the
Jacobian must have full row rank.
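A numerical version of this check can be sketched as follows: build the Jacobian by finite differences and test whether it has full row rank (rank equal to the number of parameters). The two-class latent class model, the parameter values, and the function names are my own choices for illustration, not anything specific to the table in question.

```python
import itertools

import numpy as np


def cell_probs(theta, n_items):
    """Cell probabilities of a two-class latent class model, binary items.

    theta = [class-1 weight,
             P(item = 1 | class 1) for each item,
             P(item = 1 | class 2) for each item]
    """
    theta = np.asarray(theta, dtype=float)
    w = theta[0]
    p1 = theta[1:1 + n_items]
    p2 = theta[1 + n_items:]
    probs = []
    for cell in itertools.product([0, 1], repeat=n_items):
        resp = np.array(cell)
        in_c1 = np.prod(np.where(resp == 1, p1, 1 - p1))
        in_c2 = np.prod(np.where(resp == 1, p2, 1 - p2))
        probs.append(w * in_c1 + (1 - w) * in_c2)
    return np.array(probs)


def jacobian(theta, n_items, h=1e-6):
    """Central differences: one row per parameter, one column per cell."""
    theta = np.asarray(theta, dtype=float)
    rows = []
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = h
        rows.append((cell_probs(theta + step, n_items)
                     - cell_probs(theta - step, n_items)) / (2 * h))
    return np.array(rows)


def jacobian_rank(theta, n_items):
    return np.linalg.matrix_rank(jacobian(theta, n_items), tol=1e-6)


theta3 = [0.4, 0.8, 0.7, 0.9, 0.2, 0.3, 0.1]   # 3 items: 7 parameters, 8 cells
theta2 = [0.4, 0.8, 0.7, 0.2, 0.3]             # 2 items: 5 parameters, 4 cells
rank3 = jacobian_rank(theta3, 3)
rank2 = jacobian_rank(theta2, 2)
print(rank3, rank2)
```

The three-item model reaches full row rank at these values. The two-item model cannot, since it has five parameters but only four cells whose probabilities sum to one.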
My question is this: For a large table, many or most cells
have observed frequencies of 0. Does the principle above apply
if one simply ignores cells with 0 frequencies?
Why? If so, you can use the method even with an unwieldy number
of cells--e.g, say, a 30-way table.
John Uebersax
Flagstaff, AZ
71302.2362@compuserve.com
Subject: exact confidence intervals
From: "Begoña Campos"
Date: Mon, 11 Nov 1996 16:22:39 +0000
Hi everybody, I need help with the following problem: the other day a
colleague of mine asked me, "What about the exact confidence interval
for the difference of proportions?" We looked up textbooks like Agresti
(1990), Fleiss (1981), Hahn & Meeker (1991), and manuals like that for
STATXACT and always found the same thing: the exact confidence interval for
a proportion, and not even a word about the difference. We know we are
dealing with the difference of Binomials with equal n (1) and
different p's, hence we cannot apply the addition property. What is
the resulting distribution? Does it make any sense to explore this
approach?
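For what it is worth, the distribution of the difference can at least be written down directly as a convolution, assuming the two counts are independent. The sketch below (function names are mine) computes P(X1 - X2 = d) for X1 ~ Bin(n, p1) and X2 ~ Bin(n, p2). It depends on both p1 and p2, not only on p1 - p2, which is part of what makes an exact interval for the difference awkward.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def diff_pmf(n, p1, p2):
    """P(X1 - X2 = d), d = -n..n, for independent Bin(n, p1) and Bin(n, p2)."""
    return {
        d: sum(binom_pmf(k + d, n, p1) * binom_pmf(k, n, p2)
               for k in range(max(0, -d), min(n, n - d) + 1))
        for d in range(-n, n + 1)
    }


pmf = diff_pmf(5, 0.3, 0.6)
for d, prob in pmf.items():
    print(d, round(prob, 4))
```

This is only the sampling distribution for given p1 and p2; turning it into a confidence interval still requires some way of handling the nuisance parameter.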
Thanks in advance.
Begona Campos
Biostatistics Unit
School of Medicine - University of Barcelona
Subject: DDF problem / PROC MIXED
From: Ingo Hary
Date: Tue, 12 Nov 1996 02:10:17 -0800
I have a problem with the computation of denominator degrees of freedom of
fixed effects in a mixed model using SAS PROC MIXED. The model is as follows:
proc mixed method=ml;
  class A B C D SUBJECT YEAR time;
  model y = A B C D
            TIME1 TIME2
            A*TIME1 B*TIME1 C*TIME1 D*TIME1
            A*TIME2 B*TIME2 C*TIME2 D*TIME2;
  random SUBJECT;
  random YEAR A*YEAR;
  repeated time / subject=SUBJECT type=ARH(1);
run;
** Note **: TIME1 and TIME2 are orthogonal polynomials in time (linear
and quadratic) which enter the model as covariables. Because of missing
data, the class variable *time* is needed in the repeated statement so
that PROC MIXED can properly align the observations according to time.
The problem with this model is that PROC MIXED computes the DDF for factors A,
B, C, D, as being equal to zero, such that F-tests, ESTIMATE and CONTRAST
statements cannot be carried out. All attempts to use the DDFM= option
(or to use REML or MIVQUE0 estimation) failed to solve the problem.
Does anybody have an idea what is wrong with that model specification?
Thanks in advance for your help.
Ingo
--
******************************************
Ingo Hary
Humboldt-University of Berlin
Faculty of Agriculture and Horticulture
Institute of Basic Animal Sciences
Lentzealle 75
14195 Berlin, Germany
Phone: 0049-30-314-71101
FAX: 0049-30-31471426
E-Mail: haryhub1@mailszrz.zrz.tu-berlin.de
******************************************
Subject: Lectureship in Australia
From: Dr Ted Catchpole
Date: Tue, 12 Nov 1996 17:03:57 +1100
-----------------------------------------
| Associate Lecturer Level A - Statistics |
-----------------------------------------
School of Mathematics and Statistics
Australian Defence Force Academy
University College
Canberra, ACT
______________________________________________________
Fixed Term 3 Years Salary $30,145 - $40,889 per annum.
______________________________________________________
The School teaches courses involving mathematics and its applications,
probability theory and statistics. An Associate Lecturer is required
with interests in statistics. Current School research interests in this
area include statistical modelling of bushfires, criminology, ecology
and data analysis in sports science. Preference will be given to
applicants with an interest in bushfires or ecology. The position will
suit someone who has a good honours degree or a Masters degree and
wishes to develop a career in statistics research and statistics and
mathematics teaching. Duties will include taking tutorials and assisting
in laboratory sessions in first year mathematics courses and higher
level statistics courses. Consideration will be given to two half-time
appointments if there are suitable applicants. The position is
available in February 1997 and will be for a fixed term of three
years. Appropriately qualified women are encouraged to apply for this
position. Knowledge and understanding of EEO/AA principles is essential.
Membership of an approved University superannuation scheme is a
condition of employment for this position.
For further information phone
Associate Professor E A Catchpole (06) 2688895,
email e-catchpole@adfa.oz.au
or Professor C Pask on (06) 2688687,
or see http://www.ma.adfa.oz.au
Applications close on 22 November 1996.
______________________________________________________
Written applications addressing the selection criteria and including
details of work experience, qualifications, contact number
(business/home), citizenship status and the names and addresses,
including facsimile numbers of at least two referees should be forwarded
to: Recruitment Officer (Personnel), University College, University of
New South Wales, Australian Defence Force Academy, Northcott Drive,
Canberra ACT 2600. Please quote reference number AO 1866.
For confirmation of receipt of applications telephone (06) 268 8726.
Subject: Wall Street Quant. Position
From: David Rothman
Date: Tue, 12 Nov 1996 08:31:27 +0100
The trading arm of a major investment firm is seeking a quantitative
specialist for its New York based Analytical Equity Trading Group to
work with its senior professionals in the on-going development of
sophisticated statistical/econometric trading models and strategies.
QUALIFICATIONS:
The successful candidate will have in-depth knowledge of financial
economics, time series econometrics, stochastic processes and the
requisite skills necessary to design and implement strategies in a
sophisticated computer environment. Comfort in dealing with
Probabilistic notions such as Random Walk, Brownian Motion and
Martingale Theory, combined with Econometric ideas such as Stationarity,
Cointegration, Error-Correction Models and Arch/Garch is essential.
This position would be ideal for someone with prior experience in a
related field, and/or academic training near or at the Ph.D. level.
CONTACT:
E-mail: nyrtd@ny.ubs.com
Please reply via email with either a resume or a short informal
description of yourself. Please include a day & evening phone number.
We are an Equal Opportunity employer.