Newsgroup sci.stat.math 12007

Directory

Subject: Sequential Significance Testing -- From: david.hadorn@vuw.ac.nz (David Hadorn)
Subject: Q: Optimization vs. Nonlinear Regression -- From: Bernhard Treutwein
Subject: Re: Implausible null hypotheses -- From: Mark Myatt
Subject: Re: Meaning of Correlation Coefficient -- From: Michael Kamen
Subject: SPSS and GLM: HELP!!! -- From: Pawel Michalak
Subject: Re: Sequential Significance Testing -- From: Rodney Sparapani
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!! -- From: vanhala@cc.helsinki.fi (Tomas Vanhala)
Subject: Re: Basic question on P values -- From: tjb@acpub.duke.edu (tom)
Subject: multivariate regression -- From: etsae@csv.warwick.ac.uk (Mr C P Quigley)
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!! -- From: Alain Cardinal
Subject: Re: Logit & Probit by TSP -- From: doncram@leland.Stanford.EDU (Donald Peter Cram)
Subject: Re: Query: smoothing-spline software -- From: Rodger Whitlock
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!! -- From: root@xinside.com (Jon Trulson)
Subject: Re: Confidence Limits for mu -- non-normal distribution -- From: wpilib+@pitt.edu (Richard F Ulrich)
Subject: Re: multivariate regression -- From: ebohlman@netcom.com (Eric Bohlman)
Subject: Re: Basic question on P values -- From: wpilib+@pitt.edu (Richard F Ulrich)
Subject: Job Opportunity:Statistical Quality Engineer -- From: ambgrp@aol.com
Subject: Re: Output unit scaling ? -- From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Logit & Probit by TSP -- From: clint@leland.Stanford.EDU (Clint Cummins)
Subject: Call for Papers. -- From: "M. Bennamoun"
Subject: Re: SPSS and GLM: HELP!!! -- From: lthompso@s.psych.uiuc.edu (Laura Thompson)
Subject: Re: Q: Optimization vs. Nonlinear Regression -- From: clint@leland.Stanford.EDU (Clint Cummins)
Subject: Frequency Response Function -- From: Benjamin Roberts <"no"@spam -- real address in sig.>
Subject: '97 Courses -- From: "University Associates of Princeton"
Subject: Practical vs Educational -- From: Antonio Black
Subject: Re: multivariate regression -- From: aacbrown@aol.com
Subject: Re: growth, decline, steady state (roughly), or just outright fluctuation -- From: nakhob@mat.ulaval.ca (Renaud Langis)
Subject: Re: Occam's razor & WDB2T [was Decidability question] -- From: ikastan@alumnae.caltech.edu (Ilias Kastanas)

Articles

Subject: Sequential Significance Testing
From: david.hadorn@vuw.ac.nz (David Hadorn)
Date: 3 Dec 1996 09:00:51 GMT
I'm trying to ascertain the current "conventional wisdom" among 
philosophically minded statisticians concerning the validity of sequential 
significance testing.  On the one hand are Bayesian statisticians, who condemn 
the whole notion of sequential testing as symptomatic of the fatal flaws in 
frequentist-based Fisherian significance testing (e.g., "why should what *I* 
believe depend on how often *he* peeked at the data [using an inevitably 
arbitrary set of stopping rules]?"), and who have sometimes called sequential 
testing "sampling to a foregone conclusion."  On the other hand are the 
classical statisticians who seem to see nothing wrong with sequential testing 
as long as the .05 probability "account" is "spent" a little at a time over 
the sequential tests.  But doesn't Fisherian significance testing depend on 
*preassignment* of the experimental outcome space, and doesn't sequential 
testing violate this tenet?
The issue has become quite an important one in the field of health outcomes 
research (which is where I'm coming from), because more and more studies are 
being stopped early because of the results seen during early peeking at the 
data.  Combined with an unfortunate and increasing propensity to conduct 
studies without blinding (and without testing the inter-observer reliability 
of assessment of soft outcomes) it seems to me we are going to see more and 
more treatments given formal blessing without really being subjected to the 
full and rigorous testing we once aspired to.
Any thoughts would be welcome.  Please reply privately (or cc: me if posted to 
list) as my mail reader is unreliable over here.
Thanks!
David Hadorn
david.hadorn@vuw.ac.nz
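[The "sampling to a foregone conclusion" objection above is easy to illustrate with a short simulation. This sketch is the editor's, not the poster's; all numbers are made up. Peeking at accumulating null data after every batch, and stopping at the first p < .05, pushes the overall Type I error rate well above the nominal 5%.]

```python
# Simulate repeated interim "peeks" at null data with a naive p < .05
# stopping rule; the overall false-positive rate is inflated well
# beyond .05, which is the core of the objection in the post.
import random
import math

def z_p_two_sided(x):
    """Two-sided p-value for a one-sample z test of mu = 0 (known sd = 1)."""
    n = len(x)
    z = (sum(x) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_trial(rng, looks=10, batch=20):
    """Test after every batch of data; stop at the first p < .05."""
    data = []
    for _ in range(looks):
        data.extend(rng.gauss(0, 1) for _ in range(batch))
        if z_p_two_sided(data) < 0.05:
            return True   # declared "significant" at some interim look
    return False

rng = random.Random(1)
trials = 2000
hits = sum(peeking_trial(rng) for _ in range(trials))
print(hits / trials)  # well above the nominal .05, under a true null
```

Formal group-sequential designs (O'Brien-Fleming, Whitehead) avoid this by spending the .05 "account" across the interim looks.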
Return to Top
Subject: Q: Optimization vs. Nonlinear Regression
From: Bernhard Treutwein
Date: Tue, 03 Dec 1996 11:10:04 +0100
Is there any difference between 
constrained optimization and
constrained nonlinear regression,
     i.e. finding optimum parameter values of a nonlinear function
for binary response variables?
Any response and clarification is welcome. 
Thanks in advance
    Bernhard Treutwein,
    bernhard@imp.med.uni-muenchen.de
    --------------------------------
    C is its own virus
Return to Top
Subject: Re: Implausible null hypotheses
From: Mark Myatt
Date: Tue, 3 Dec 1996 10:10:02 +0000
Bill Simpson  writes:
>>The "moronic" idea is to let some quasi-religious mantra decide what
>>action to take.  You have decisions to make; statistical decision theory
>>is designed to help YOU make the decision appropriate for YOU; use it,
>>instead of following the blind.
>>-- 
Often it is the implausibility of the 'null hypothesis' that leads to a
realisation that the wrong approach is being taken. Take, for example,
comparing two methods of measuring the same thing ... you might start off
with a scatter-plot and then use OLS regression and look at the p-value
for 'F' as a way of assessing the agreement between the two methods. But
... in this case the null (that the two measures are not correlated) is
absurd - so the approach is wrong. Sorry for the prosaic example.
In this example, applying the approach of 'stating the null hypothesis' is
useful as long as you apply some thought to what the null hypothesis is
actually stating.
Just my tuppence.
-- 
Mark Myatt
Return to Top
Subject: Re: Meaning of Correlation Coefficient
From: Michael Kamen
Date: Tue, 03 Dec 1996 08:10:53 -0800
Robert,
Robert E Sawyer wrote:
> . . .This form lends itself to the interpretation:
> MST = "total (y-)variation" in the independent variable y,
> MSE = "(y-)variation UNexplained" by the linear model
> (MSE = 0 iff all yhat_i = y_i, i.e. all the data points lie on a straight line)
> MST-MSE = "(y-)variation explained" by the linear model
> 
> This should answer your question:
> "better fit" <-> proportionately smaller MSE <-> proportionately greater MST-MSE
> --
> Robert E Sawyer
> soen@pacbell.net
> _________________
Yes it does!  Thank you sir!  I think I need to get a new stats book
because the one I am using does not speak to me very clearly.  It is the
kind of book I would like to keep, though, as a reference for the day
when I have a solid hold on this stuff.
Thanks again!
Michael
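[A small numeric check of the decomposition quoted above, added by the editor; the data are made up, and sums of squares are used in place of mean squares, which carry the same interpretation of "explained" vs. "unexplained" variation.]

```python
# Fit a line by ordinary least squares and verify that
# R^2 = (SST - SSE) / SST, i.e. the "explained" share of variation.
def fit_line(xs, ys):
    """OLS for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

xs = [1, 2, 3, 4, 5]                       # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)
my = sum(ys) / len(ys)
sst = sum((y - my) ** 2 for y in ys)       # total (y-)variation
sse = sum((y - (a + b * x)) ** 2           # variation UNexplained
          for x, y in zip(xs, ys))         # (0 iff points lie on a line)
r2 = (sst - sse) / sst                     # variation explained, as a share
print(round(r2, 4))                        # near 1: a very good fit
```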
Return to Top
Subject: SPSS and GLM: HELP!!!
From: Pawel Michalak
Date: Tue, 3 Dec 1996 15:09:00 +0100
Dear Statisticians,
I need urgent help with ANOVA and ANCOVA in the SPSS package.
It mainly concerns sums of squares (SS). There are the so-called
"SAS type" SS you can find in the literature, types I to IV.
What are their equivalents in SPSS? In SPSS there are the Regression
Approach, the Hierarchical Approach and the Experimental Approach. How
can one relate SPSS's classification to the SAS types?
Thanks in advance for any help.  
Pawel Michalak
============================================================================
Home WWW Page: http://www.cyf-kr.edu.pl/~uemichal
Info: finger pawel@haldane.pop.bio.aau.dk & uemichal@kinga.cyf-kr.edu.pl
Return to Top
Subject: Re: Sequential Significance Testing
From: Rodney Sparapani
Date: Tue, 03 Dec 1996 10:14:09 -0500
David had many interesting points (at least AFAIC).  I am a
biostatistician who works on clinical trials and I have been watching
the schism that has developed between frequentists and Bayesians. 
Personally, I am a frequentist and find Whitehead's work on sequential
testing particularly appealing.  However, Bayesians pooh-pooh all that
(as well as O'Brien-Fleming).  The problem is that you have to be a
Bayesian to understand their arguments.  A lot of the Bayesian papers
that I see are incomprehensible since they are making so many
assumptions that are either alien to frequentists or simply not stated.
For instance, after I took a Brophy and Joseph paper (JAMA, March 15,
'95, Vol. 273, No. 11, pp. 871-875) to a Bayesian and had it explained
to me, I dispute the conclusion that Bayesian analysis was used at all. 
I can produce the same calculations (and get exactly the same p-values,
although they don't call them that) with meta-analysis and Fleiss'
normal approximation to the binomial (w/o continuity correction).  Can't
we all get along?
Rodney (not King) Sparapani
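[For readers unfamiliar with the calculation mentioned, here is an editor's sketch of a two-proportion comparison by the normal approximation to the binomial, without a continuity correction. The event counts are hypothetical, not the JAMA paper's.]

```python
# Pooled two-proportion z test: normal approximation to the binomial,
# no continuity correction.
import math

def two_prop_z(x1, n1, x2, n2):
    """Two-sided p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                    # pooled rate under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# hypothetical event counts in two treatment arms
print(two_prop_z(30, 200, 45, 200))
```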
Return to Top
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!!
From: vanhala@cc.helsinki.fi (Tomas Vanhala)
Date: 3 Dec 1996 17:43:11 +0200
In <57v1me$i@rainbow.rmii.com> neil@rmi.net (Neil Schroeder) writes:
>Without question, Linux is the way to go.  
>My experience with Linux is with the Slackware release, and I have found it to 
>be the most feature-rich of them all.
Probably. But it has a reputation of being the most difficult
distribution to upgrade without doing a total re-install.
>Linux has more included value than any other UNIX release available (excepting 
>Berkeley Software Design's BSD/OS, a commercial product).
But the value of the add-ons sometimes suffers from poor quality. For
example, NFS in Linux is barely usable both when it comes to
performance, features (file locking) and reliability. 
>Its kernel is constantly updated, and includes specific and high-performance support for 
>more cards, motherboards, chipsets, and etc than anything else. That hardware 
>specific support alone makes it the most valuable.
But very many device drivers lack "limp-back" code, meaning that you
could be in even worse trouble with Linux than with some other OS'es
if/when your hardware starts malfunctioning.
>Linux is very easy to configure and provides solid and stable networking code. 
> It comes with a vast variety of applications and has the single widest 
>application support of any UNIX OS.
This is without doubt true.  Also, many commercial software vendors set
the price for their Linux ports at a fraction of what their products
cost on other platforms. 
>Again, I don't think I can stress enough the value of the kernel support and 
>application code available.  Linux is free, powerful, and has all you need.  
>You should definitely give it a try. 
I've found Linux easier to install and to tune to our liking than, say,
UnixWare, but maintaining a Linux host over the long run has definitely
meant more work than has been necessary to maintain a UnixWare host. 
E.g., even "professional" distribution maintainers like Red Hat only
release security patches for their very latest release, which means you
have to upgrade constantly or spend a lot of time doing hacks. 
So, it really depends. For someone who maintains his workstation himself
and does not have to rely on NFS, I would not hesitate recommending
Linux. But from the point of view of someone who maintains a cluster of
workstations the perspective is somewhat different. 
-- 
Tomas Vanhala                                 vanhala@paulus.helsinki.fi
Tel. (90) 191 22097                     http://www.helsinki.fi/~vanhala/
Return to Top
Subject: Re: Basic question on P values
From: tjb@acpub.duke.edu (tom)
Date: Tue, 03 Dec 1996 11:26:31 +0000
In article <57veu2$48h@usenet.srv.cis.pitt.edu>, wpilib+@pitt.edu (Richard 
F Ulrich) wrote:
> Think of it like this:   the  p-value  of .5  represents
> the middle of the expected outcomes under the simple null hypothesis.
> That is, the cumulative distribution runs from 0 to 1, where only
> the upper extreme of scores is in the 'rejection area'.
> If you are grabbing BOTH extremes  of the distribution as part of 
> your rejection area, as you do with the two-tailed t-test, then
> it is not very natural to refer to other cut-off points.  BY 
> arbitrarily doubling the value (p)  or (1-p), you can label 
> cutoff points so that what was .5  is now 1.0.
Let me make sure I understand this by phrasing my question experimentally.
Suppose you take a set of binomial data consisting of truly random 0s and 
1s. You split that data set into two parts and check for a difference. As 
your n approaches infinity, your p should approach 1.  I know this is the 
case.
Okay, now suppose you are dealing with normally distributed data. You 
generate values distributed normally from one formula, and split that data 
set at random into two parts and check for differences. As your n 
approaches infinity, if I understand you correctly, the p will approach 
1.0 for a one tailed test, and 0.5 for a two tailed test. Is this correct?
> : An email cc: of posted replies is appreciated
> 
>   -- my Reader does not make that convenient.  Do you have Newsfeed
> problems, or are you too lazy to pop in and look for responses?
Mainly, I'm too lazy to pop in and look for responses. 
Also, even if someone is kind enough to respond the retention time at my 
site is short so if I can't check news for a couple of days I may miss the 
response.
>   -- An E-mail address included within  *your*  text would be
> appreciated, too, since some NewsReaders don't show what it was,
> and what is shown is wrong a sizable fraction of the time, anyway.
Are you sure about this? It would seem any standard newsreader would 
provide the correct email address, since a correctly posted article will 
have the return address in the From: header.
Thanks for your reply & help with my question.
Tom
Return to Top
Subject: multivariate regression
From: etsae@csv.warwick.ac.uk (Mr C P Quigley)
Date: 3 Dec 1996 16:47:43 -0000
If I have a set of data with many variables used to predict a single
output variable, what kind of methods are available to do this?  As I
understand it, this is much more complicated than standard regression
modelling with just one input, one output variable.
-- 
-------------------------------------------------------------------
Chris Quigley,Advanced Technology Centre,University of Warwick,
Coventry CV4 7AL,United Kingdom.
Return to Top
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!!
From: Alain Cardinal
Date: Tue, 03 Dec 1996 12:25:02 -0500
Evan Leibovitch wrote:
> 
> In article <329CB17A.C9F@ucla.edu>, Bryan Austin   wrote:
> 
> >I am in the market for a UNIX operating system. I have narrowed the
> >search down to three 3 prospects: SCO UNIX 2.1, Solaris x86 UNIX, and
> >Lunix. My question is, which of the three is the best choice, and more
> >importantly, Why? I will be using the operating system for business and
> >personal use.
> 
> >I am positive that all three OSs have some strengths and weaknesses.
> >This has been my method of evaluation so far. If anyone can help please
> >reply.
For personal use: Linux is a good choice; the price is important.
For business use: Linux is dangerous.
SCO is the most stable.
Solaris is the most stable, and you can upgrade to a better computer
system later.
Why do you use a UNIX system? For the applications, not for UNIX itself.
Usually, in business, you should not have to restart your computers
every day; you need a stable operating system.
You do not have the choice of the great applications (Oracle,
Informix, ...) with Linux.
A PC is not a stable computer.  When you have a computer room with 20
servers (10 SGI and 10 PC), you have to restart the PCs at regular
intervals; the SGIs you can forget to shut down for a year.
SGI, Sun, OSF and AIX are very stable.
If you have any problem with an Oracle product or Netscape on SGI or
Sun, support is easy.  With Linux, Oracle and the other companies
respond, "just select a good machine and you will have good support".
What do you want? A UNIX PC for games and word processing, or UNIX for
business applications?
UNIX SYSTEM ADMINISTRATOR
cardinaa@sit.qc.ca
Return to Top
Subject: Re: Logit & Probit by TSP
From: doncram@leland.Stanford.EDU (Donald Peter Cram)
Date: 3 Dec 1996 10:01:06 -0800
In article <32A3F2B3.1560@students.wisc.edu>,
Tatsuo Ochiai   wrote:
>I am wondering what the algorithms and the convergence criteria TSP uses
>for Logit and Probit model.
>
>
>Thanks in advance.
>
>
>Tatsuo Ochiai
>tochiai@students.wisc.edu
See the "Method" section under Probit and under Logit in the TSP
Reference Manual.
regards,
Don Cram
-- 
doncram@gsb-ecu.stanford.edu
http://www-leland.stanford.edu/~doncram
Return to Top
Subject: Re: Query: smoothing-spline software
From: Rodger Whitlock
Date: 3 Dec 96 09:56:03 PST
Ed Hughes  wrote:
>Is there any (preferably free) software available to 
>compute smoothing splines for data on points in 2 and 3 
>dimensions?  I'd prefer that it allow irregular data 
>points, but if it required points to be on a regular
>grid, I can live with it.
I've used Gaussian and binomial smoothing with excellent results in 2 
dimensions.
----
Rodger Whitlock
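[An editor's sketch of the kind of Gaussian smoothing mentioned, here in one dimension for brevity: a kernel-weighted average, which handles irregularly spaced data points. The data are made up.]

```python
# Gaussian kernel smoothing (Nadaraya-Watson style weighted average).
import math

def gaussian_smooth(xs, ys, bandwidth, grid):
    """Smoothed value at each grid point: Gaussian-weighted mean of ys."""
    out = []
    for g in grid:
        w = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2) for x in xs]
        out.append(sum(wi * yi for wi, yi in zip(w, ys)) / sum(w))
    return out

xs = [0.0, 0.4, 1.1, 1.5, 2.2, 3.0]        # irregularly spaced points
ys = [0.1, 0.5, 0.9, 1.1, 0.8, 0.2]
smoothed = gaussian_smooth(xs, ys, bandwidth=0.5, grid=xs)
print([round(v, 2) for v in smoothed])
```

The bandwidth plays the role of the spline's smoothing parameter: larger values give a smoother, flatter curve.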
Return to Top
Subject: Re: UNIX OPERATING SYSTEM, WHICH ONE!!!!!!!
From: root@xinside.com (Jon Trulson)
Date: 3 Dec 1996 19:30:09 GMT
In comp.unix.unixware.misc Raymond N Shwake  wrote:
> evan@bigbird.telly.org (Evan Leibovitch) writes:
> >SMALL SYSTEMS/WORKSTATIONS: Linux
> >.... Commercial implementors of
> >Linux such as RedHat and Caldera now bundle X servers which outperform
> >UnixWare's. Linux takes less horsepower to be runnable than UnixWare;
> >a non-X Linux system can still run fine on a 386/20 with 8MB RAM.
> >Linux co-exists much better on multi-OS, multi-boot system;
> >UnixWare has no equivalent to LILO.
	Ahh well. LILO is nice, but a pain in the ass sometimes... OSBS works 
fine for me at home, though LILO (and now System Commander) rules here at work... 
I'd hardly say that Linux is better because of LILO though ;-)
> 	I never cared for LILO; to me it's proven more trouble than it's
> worth. I've been using osbs20b8 for the last year or so with much success.
> (Was a later version ever released?) BTW, I started running Destiny (the
> original SVr4.2) on a NEC 386/25 with 8 MB - X included - and even the
> later 1.1 on a 486/33 with as little as 8 MB, though I'd never try that
> with current UnixWare 2.x.
	Hehehe.  When I was at IF and we started selling Destiny, I once loaded
it (base only) on a 386SX/16 with 4MB of RAM as a test.  It actually ran, though
pretty sloooowwwwlyy. ;-)  We did use one of those boxes (w/ 8MB) as an intranet
gateway, and it performed flawlessly during the 2 years it ran... 
- --
Jon Trulson    work: jon@xinside.com, home: jon@radscan.com
PGP ID: 1A9A2B09, Fingerprint: C23F328A721264E7 B6188192EC733962 
PGP public key at finger:trulson@rainbow.rmii.com or keyserver
We are Grey.  We stand between the Candle and the Star.
#include 
Return to Top
Subject: Re: Confidence Limits for mu -- non-normal distribution
From: wpilib+@pitt.edu (Richard F Ulrich)
Date: 3 Dec 1996 19:31:21 GMT
Michael Kamen (mbkamen@facstaff.wisc.edu) wrote:
<< deleted, comments from previous posts...>>
: I am analyzing the waits separately from the service times.  The 
: total waiting time is really a sum of various idle periods for 
: the patient in the system.  It is a bit like looking at the total 
: time a workpiece spends in buffers in a manufacturing line.  
: The reason I am not treating this as a Poisson process is because 
: the start of service is supposed to be a scheduled event. It is 
: not the interarrival time of patients that matters here, but the 
: lag from the scheduled time to the actual start of service.  
: Perhaps there is something I am missing though.  It would be 
: interesting to analyze this as a Poisson; I just decided not to 
: do that for the reason given.   
 -- Okay, if the start of service is supposed to be scheduled,
then, WHAT GOES WRONG?  Why are there any waits?  Is someone
taking longer than expected to deal with the previous patient?  Or
are they just killing too much time by the water cooler?
It seems to me that some insights from ordinary queuing theory
still apply.  If your services are supplied at CLOSE to the
maximum possible, then, potentially, you can expect some VERY 
LONG lines.  All it takes is a little bit of disruption, a little
bit of slowdown, and the delay for one customer becomes a
delay for EVERY customer after -- if you do not have some way of
speeding up or discarding the longest jobs, or adding servers
when the lines start to build up.
I still think that characterizing the time required for various
"tasks"  is apt to be helpful.  A task might "hardly ever" be
a problem, but a few "hardly-evers"  might add up to something
noticeable.
Rich Ulrich, wpilib+@pitt.edu
Return to Top
Subject: Re: multivariate regression
From: ebohlman@netcom.com (Eric Bohlman)
Date: Tue, 3 Dec 1996 19:31:26 GMT
Mr C P Quigley (etsae@csv.warwick.ac.uk) wrote:
: If I have a set of data with many variables used to predict a single
: output variable, what kind of methods are available to do this.  As I
: understand it, this is much more complicated than standard regression
: modelling with just one input, one output variable.
What you're referring to is called "multiple regression" ("multivariate 
regression" refers to regression models with multiple *dependent* 
variables) and it really isn't different from linear regression with a 
single independent variable (which is just a special case of the more 
general method).  The main difference is that you have to worry about 
high correlation between independent variables making it difficult to 
distinguish their contributions to the dependent variable 
(multicollinearity) and you have to take into account different scaling 
among the independent variables (done by standardizing coefficients).
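[An editor's illustration of the two points above, with simulated data: fitting several predictors at once, and seeing why a near-collinear pair is hard to disentangle.]

```python
# Multiple regression via least squares, with one pair of predictors
# made nearly collinear on purpose.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + rng.normal(scale=0.3, size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.5 * x3 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2, x3])    # intercept + 3 predictors
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares fit
print(np.round(beta, 2))                         # x1/x2 slopes are unstable

r12 = float(np.corrcoef(x1, x2)[0, 1])           # high: a warning sign
print(round(r12, 2))
```

The intercept and the x3 slope are recovered cleanly; the x1 and x2 coefficients trade off against each other, which is exactly the multicollinearity problem described.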
Return to Top
Subject: Re: Basic question on P values
From: wpilib+@pitt.edu (Richard F Ulrich)
Date: 3 Dec 1996 20:04:49 GMT
tom (tjb@acpub.duke.edu) wrote:
 << deleted, citing me ... I don't know how I could be so
misunderstood...>>
: Let me make sure I understand this by phrasing my question experimentally.
: Suppose you take a set of binomial data consisting of truly random 0s and 
: 1s. You split that data set into two parts and check for a difference. As 
: your n approaches infinity, your p should approach 1.  I know this is the 
: case.
  -- NO!  Actually, by sampling theory, your "p-level" has a 5% chance of
being above 95% cumulative (one tail), and a 5% chance of being below
5% cumulative....  I do not know what you have in mind that you think
"is the case".
: Okay, now suppose you are dealing with normally distributed data. You 
: generate values distributed normally from one formula, and split that data 
: set at random into two parts and check for differences. As your n 
: approaches infinity, if I understand you correctly, the p will approach 
: 1.0 for a one tailed test, and 0.5 for a two tailed test. Is this correct?
  -- NO!  For the symmetric curve, the "likeliest" result (that is,
where the probability-density is greatest) is, as it happens, at the 
middle;  where the cumulative distribution is  (it should be obvious,
if you think of the definition for "cumulative distribution") .5 .
Any artificial two-tail test, taken by adding together two extremes,
is a BASTARDIZED test, and you can call the middle,  "p-level= 1.0",
by *doubling the p-levels*, just like you did for the extremes.  But
that is a procedure that is  _ad hoc_.
If you look at the p-level for an F-test where the F is the square of
the t-test, the large Fs correspond to the low (negative), plus HIGH ts.
But you COULD consider the F-test as two-tailed, for large, absolute
differences, plus the SMALLEST differences, so the F with a small
value, for "equal means",  would be considered an unusual instance,
and part of the rejection region.  (You don't often want to do this,
but I am trying to point out the logic, and the possibility.)
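[The editor's quick simulation of the point being made: under a true null, the p-value does not drift anywhere as n grows; it stays roughly uniform on (0, 1), so about 5% of p-values still fall below .05 at any sample size.]

```python
# Split large null samples in two, test for a difference, and look at
# the distribution of the resulting p-values.
import random
import math

def two_sample_z_p(a, b):
    """Two-sided p-value for equal means, known unit variances."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    z = (ma - mb) / math.sqrt(1 / len(a) + 1 / len(b))
    return math.erfc(abs(z) / math.sqrt(2))

rng = random.Random(7)
n = 2000                                    # size of each half
ps = []
for _ in range(400):
    data = [rng.gauss(0, 1) for _ in range(2 * n)]
    ps.append(two_sample_z_p(data[:n], data[n:]))
print(round(sum(ps) / len(ps), 2))          # stays near .5, not near 1
print(sum(p < 0.05 for p in ps) / len(ps))  # stays near .05
```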
Tom: > : An email cc: of posted replies is appreciated
me: >   -- my Reader does not make that convenient.  Do you have Newsfeed
: > problems, or are you too lazy to pop in and look for responses?
Tom: Mainly, I'm too lazy to pop in and look for responses. 
  -- so, I should extend myself for your pleasure, in hopes that 
your Highness will smile upon me and bless me some day ....?
Tom: Also, even if someone is kind enough to respond, the retention time
: at my site is short so if I can't check news for a couple of days I may
: miss the response.
me: >   -- An E-mail address included within  *your*  text would be
: > appreciated, too, since some NewsReaders don't show what it was,
: > and what is shown is wrong a sizable fraction of the time, anyway.
Tom: Are you sure about this? It would seem any standard newsreader would 
: provide the correct email address, since a correctly posted article will 
: have the return address in the From: header.
  -- I don't use the REPLY-by E-mail  feature very often, and I
have not kept a running count, but a few dozen letters worth of
experience suggests that:  STAT readers who do not include their
E-mail addresses have improper  "From:"  headers at least one fifth of 
the time.  There is a fairly large fraction of readers, also, who do
list a different E-mail address in their text, but (like mine used to
be) that could reflect a preference, rather than potential error.
Rich Ulrich, biostatistician              wpilib+@pitt.edu
Western Psychiatric Inst. and Clinic   Univ. of Pittsburgh
Return to Top
Subject: Job Opportunity:Statistical Quality Engineer
From: ambgrp@aol.com
Date: Tue, 03 Dec 1996 15:11:04 -0600
Here's a job opportunity.  If you are interested and qualified, please
see contact information below.  If you know of someone who might be
interested and qualified, we appreciate the referral!  Thank you.
Position Description: Statistical Quality Engineer
Will evaluate, develop, train, and support statistical methods including
SPC, classical and Taguchi design of experiments, correlation/regression
analysis, probability theory, hypothesis testing, significance tests,
acceptance sampling plans, Weibull analysis and applied reliability
methods throughout 19 OE plants and the corporate office.
Position Requirements:
Undergraduate degree in Statistics or Engineering with significant
concentration in Statistics.  Advanced degree
preferred.
Requires 3-5 years industrial and process improvement experience. 
Superior presentation and classroom teaching experience required.  This
position interacts with mid and upperlevel management to provide stats
support, training, and updates regarding statistical method
implementation and applications.  Substantial interaction with technical
personnel from all divisions and departments is required as is
substantial PC and stats software
skills.
This is a newly-created position and reports to the Quality Assurance
Manager.
Salary range: 
Depending on experience, $45,000 to $59,300
Company Information:  
Modine Manufacturing, located in Racine, Wisconsin, is an independent
worldwide leader in heat transfer technology, serving the vehicular,
industrial, commercial, and building HVAC markets. 
Contact:
Andy Lane
The Ambrose Group
Milwaukee, WI
1-800-925-8244
ambgrp@aol.com
414-273-8250 (fax)
-------------------==== Posted via Deja News ====-----------------------
      http://www.dejanews.com/     Search, Read, Post to Usenet
Return to Top
Subject: Re: Output unit scaling ?
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Date: Tue, 3 Dec 1996 22:49:51 GMT
In article , ebx@cs.nott.ac.uk (Edward A G Burcher) writes:
|> ...
|> I am trying to build a 7-10-10-3 feedforward neural net, with full
|> connectivity between successive layers but no (direct) connectivity between
|> non-adjacent layers. I am currently using the standard sigmoid function
|> as my activation function. The problem is that I have training and test data
|> where all seven inputs typically vary in a small range 0-0.3 . I am quite happy
|> to normalise this data; However, my 3 output units are tricky to deal with.
|> One of them varies in the range 200-20000, the second from 100-300 and the
|> third 20-70. Clearly, with such large numbers, the network will find it 
|> difficult (impossible?) to be trained on such data, as my activation function
|> only gives output in the [0,1] range. 
See "Why use activation functions?" and "Should I normalize/standardize/rescale 
the data?" in the Neural Network FAQ, part 2 of 7: Learning, at
ftp://ftp.sas.com/pub/neural/FAQ2.html
|> I have heard it is possible to adapt the
|> sigmoid function to give a nonlinear activation function with a larger range.
|> How is this done exactly, and would it be a suitable technique for solving the
|> problem ?
...
|> I was thinking of something simple as a scaling function such as
|> 
|> (unit value - min value the unit can take ) / ( max value that unit can take - min value )
|> 
|> Is this suitable (assuming all of this is a valid approach) ...
Yes.
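[The scaling the poster proposes, sketched out by the editor with the three output ranges from the post; the target values themselves are made up. Scale each output to [0, 1] for training, then invert the map to read predictions back in original units.]

```python
# Min-max scaling of network targets, and its inverse.
def to_unit(v, lo, hi):
    """(unit value - min value) / (max value - min value)."""
    return (v - lo) / (hi - lo)

def from_unit(u, lo, hi):
    """Inverse map, for converting network outputs back to real units."""
    return lo + u * (hi - lo)

ranges = [(200.0, 20000.0), (100.0, 300.0), (20.0, 70.0)]  # the 3 outputs
targets = [5000.0, 250.0, 45.0]                            # hypothetical
scaled = [to_unit(t, lo, hi) for t, (lo, hi) in zip(targets, ranges)]
print([round(s, 3) for s in scaled])       # all inside [0, 1]
restored = [from_unit(s, lo, hi) for s, (lo, hi) in zip(scaled, ranges)]
print(restored)                            # round-trips to the targets
```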
I have restricted follow-ups to comp.ai.neural-nets.
-- 
Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***
Return to Top
Subject: Re: Logit & Probit by TSP
From: clint@leland.Stanford.EDU (Clint Cummins)
Date: 3 Dec 1996 15:35:58 -0800
>Tatsuo Ochiai   wrote:
>>I am wondering what the algorithms and the convergence criteria TSP uses
>>for Logit and Probit model.
Donald Peter Cram  wrote:
>See the "Method" sections under Probit and under Logit in the TSP
>Reference Manual.
    That handles the algorithm (Newton-Raphson, using analytic second
derivatives; a pretty standard method).  The convergence criterion is
described in Section 10.1 of the TSP User's Guide.  The default is that
the relative change in the parameters (in the final iteration) is less
than .01 or .001 (controlled by the TOL option; see Section 10.7 or
NONLINEAR in the Reference Manual).
Clint Cummins
(TSP tech support)
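[A generic editor's sketch of the method described, not TSP's actual code: Newton-Raphson on a logit likelihood (logit rather than probit, for brevity), with the same kind of stopping rule based on the relative change in the parameters.]

```python
# Newton-Raphson maximum likelihood for a logit model.
import numpy as np

def logit_mle(X, y, tol=1e-3, max_iter=50):
    """Iterate Newton steps until the relative parameter change < tol."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # fitted probabilities
        grad = X.T @ (y - p)                       # score vector
        info = X.T @ (X * (p * (1 - p))[:, None])  # information matrix
        step = np.linalg.solve(info, grad)         # Newton step
        beta = beta + step
        if np.max(np.abs(step) / (np.abs(beta) + 1e-8)) < tol:
            break                                  # converged
    return beta

# simulated data with known coefficients, to check the fit
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true = np.array([0.5, -1.0])
y = (rng.random(500) < 1.0 / (1.0 + np.exp(-X @ true))).astype(float)
print(np.round(logit_mle(X, y), 2))                # near the true values
```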
Return to Top
Subject: Call for Papers.
From: "M. Bennamoun"
Date: Wed, 04 Dec 1996 09:09:28 -0800
******************************************************
                  CALL FOR PAPERS
                 Focus Symposium on
        "Image Processing and Related Fields"
            To be held within the SCI'97
        WORLD MULTICONFERENCE ON SYSTEMICS,
           CYBERNETICS AND INFORMATICS
                Caracas, Venezuela
                  July 7-11, 1997
******************************************************
PURPOSE OF THE SYMPOSIUM
`````````````````````````````````````````````
The purpose of this symposium is to gather researchers in the
area of Image Procesing and related fields, such as Computer Vision,
Robotics, Multimedia, Pattern Recognition, and others.
Image Processing includes all the various operations that can be applied
to image data. Two principal applications are the prime movers of the
interest in Image Processing methods: improvement of pictorial
information for human interpretation, and processing of scene data
for autonomous machine perception.  The applications of this area
are numerous.  There are applications in the medical field (medical
imaging), in remote sensing, and in the military, such as the
development of Automatic Target Recognition (ATR) systems.
The recent advances in this area have resulted from the application of
new mathematical theories, such as wavelets, and from models consonant
with current views regarding the neurophysiology and psychophysics
of human perception.
The goal of this symposium is to present the recent advances in the 
area of Image Processing and its applications, such as phase-based motion 
estimation, image compression using Wavelet Transforms, etc. and to 
gather speakers from different disciplines, namely computer science, 
mathematics, physics, and psychology.
TOPICS
```````````
Among the topics which will be covered (this list is not restrictive):
Image Compression Using Wavelet Transforms.
Vision Systems for Automatic Object Recognition.
Recent Advances in Edge Detection.
Phase-based Methods for Motion Estimation.
Applications of ANNs to Image Processing.
Texture Analysis and Applications.
Scene Analysis.
Image Segmentation.
Character Recognition.
Stereo Vision.
Active Vision.
Artificial Neural Networks applied to Image Processing.
Sensing of Unknown Environments.
Vision-based Vehicle Navigation.
Vision-based Guidance of Unmanned Vehicles.
SCHEDULE
````````````````
A)  Please submit your three (3) page abstract by 4 February
1997 to:
Dr. Mohammed Bennamoun,
Signal Processing Research Centre,
School of Electrical & Electronic Systems Engineering,
Queensland University of Technology,
GPO Box 2434, Brisbane, QLD 4001,
Australia
Ph.: Int'l: 61-7-3864-1204
Fax: Int'l: 61-7-3864-1516
Email: m.bennamoun@qut.edu.au
B) Acceptance and Notification: 15 March 1997.
C) Submission of Camera Ready Paper: 15 May 1997
***********************************************************************
ACADEMIC AND SCIENTIFIC SPONSORS
``````````````````````````````````````````````````````````` 
* World Organization of Systemics and Cybernetics (WOSC) (France)
* IFSR: International Federation for Systems Research (Austria/USA)	
* International Systems Institute (USA)
* CUST, Engineer Science Institute of the Blaise Pascal University (France)
* The International Institute for Advanced Studies in Systems Research and
   Cybernetics (Canada)
* Society for Applied Systems Research (Canada)
* International Institute of Informatics and Systemics (USA)
* IEEE (Venezuela Chapter)
* Simon Bolivar University (Venezuela)
* Universidad Central de Venezuela
***********************************************************************
For further information about SCI'97, contact:
 	Prof. Nagib Callaos (Chair)
 	IIIS
 	14269 Lord Barclay Dr., Orlando, Florida 32837, USA
 	TEL/FAX: 1-407-8566274
 	e-Mail: iiis@aol.com
 	Simon Bolivar University
 	Dpto. Procesos y Sistemas
 	A.P. 89000, Caracas, Venezuela
 	TEL/FAX (office):+58-2-9621519/1325
 	e-Mail: 70501.2363@CompuServe.com
 	TEL/FAX (home): +58-2-9638852
For information about the Focus Symposium above contact: 
---------------------------------------------------
Dr. Mohammed Bennamoun, 
Signal Processing Research Centre,
School of Electrical & Electronic Systems Engineering,
Queensland University of Technology,
GPO Box 2434, Brisbane, QLD 4001,
Australia
Ph.: Int'l: 61-7-3864-1204
Fax: Int'l: 61-7-3864-1516
Email: m.bennamoun@qut.edu.au
---------------------------------------------------
Return to Top
Subject: Re: SPSS and GLM: HELP!!!
From: lthompso@s.psych.uiuc.edu (Laura Thompson)
Date: 4 Dec 1996 00:22:24 GMT
Pawel Michalak  writes:
>Dear Statisticians,
>I need urgent help with ANOVA and ANCOVA in the SPSS package.
>It mainly refers to sums of squares (SS). There are so-called
>"SAS type SS" you can find in the literature, from I to IV.
>What are their equivalents in SPSS? In SPSS there are the Regression
>Approach, the Hierarchical Approach and the Experimental Approach. How
>can one relate SPSS's classification to the SAS types?
>Thanks in advance for any help.
>Pawel Michalak
If it's still the same as it has been in the past, type III SS are
equivalent to method=unique and type I are equivalent to
method=sequential.  I do recall a method=experimental, but I don't
remember what that does.
Those, I think, are the most typically used.  Type II SS partial out
all effects that contain no term in the hypothesis term.  I can't
remember exactly what type IV are, but they're like type III, except
adjusted for when you have missing cells or something like that.
>============================================================================
>Home WWW Page: http://www.cyf-kr.edu.pl/~uemichal
>Info: finger pawel@haldane.pop.bio.aau.dk & uemichal@kinga.cyf-kr.edu.pl
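For intuition, the type I (sequential) vs. type III (unique) distinction described above can be shown with plain regression arithmetic: each SS is a difference in residual sums of squares between two nested models. A minimal pure-Python sketch, not SPSS or SAS syntax; the small unbalanced data set is invented for illustration.

```python
def ols_rss(X, y):
    """Residual sum of squares from OLS of y on the columns of X
    (first column is the intercept), via normal equations and
    Gaussian elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(row[p] * row[q] for row in X) for q in range(k)]
         for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for p in range(k):
        for q in range(p + 1, k):
            f = A[q][p] / A[p][p]
            for r in range(p, k):
                A[q][r] -= f * A[p][r]
            b[q] -= f * b[p]
    beta = [0.0] * k
    for p in range(k - 1, -1, -1):
        beta[p] = (b[p] - sum(A[p][r] * beta[r]
                              for r in range(p + 1, k))) / A[p][p]
    return sum((y[i] - sum(X[i][p] * beta[p] for p in range(k))) ** 2
               for i in range(n))

# Unbalanced data: group dummy a is correlated with covariate x,
# but y depends only on x (y = 1 + 2x exactly).
a = [0, 0, 0, 1, 1, 1, 1, 1]
x = [1, 2, 3, 3, 4, 5, 6, 7]
y = [3, 5, 7, 7, 9, 11, 13, 15]
n = len(y)

# Type I (sequential): SS for a when a is entered first
type1_a = (ols_rss([[1.0] for _ in range(n)], y)
           - ols_rss([[1.0, a[i]] for i in range(n)], y))
# Type III (unique): SS for a after adjusting for x
type3_a = (ols_rss([[1.0, x[i]] for i in range(n)], y)
           - ols_rss([[1.0, a[i], x[i]] for i in range(n)], y))
```

Here the sequential SS for the group dummy is large (it soaks up variation that really belongs to x), while its unique SS is essentially zero, which is exactly the order-dependence the SS types are about.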
Return to Top
Subject: Re: Q: Optimization vs. Nonlinear Regression
From: clint@leland.Stanford.EDU (Clint Cummins)
Date: 3 Dec 1996 15:40:49 -0800
In article <32A3FC7C.41C67EA6@tango.imp.med.uni-muenchen.de>,
Bernhard Treutwein   wrote:
>Is there any difference between
>
>Constrained Optimization and
>Constrained Nonlinear Regression
>     i.e. finding optimum parameter values of a nonlinear function
>
>for binary response variables?
    Yes.  Although both methods will produce consistent parameter estimates,
the estimated standard errors will not be correct for the
nonlinear regression, due to heteroskedasticity.  This can be partly
fixed if the package you use can compute "heteroskedastic-consistent"
standard errors.  I take it that for a logit, you are fitting a nonlinear
regression function like   D = 1/(1+exp(-(a+b1*x1 + b2*x2 ... )))  ,
where  D  is the binary dependent variable.
Clint Cummins
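To see why the plain least-squares standard errors go wrong here: for binary D the "error" D - p(x) has variance p(x)(1 - p(x)), which moves with the regressors. A small Python illustration; the coefficients a and b are made up for the example.

```python
import math

def logistic(a, b, x):
    """The regression function p(x) = 1/(1+exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# For binary D with E[D|x] = p(x), Var(D|x) = p(x)*(1-p(x)), so the
# constant-variance assumption behind plain NLS standard errors fails.
a, b = -1.0, 3.0                             # illustrative values only
p0, p1 = logistic(a, b, 0.0), logistic(a, b, 1.0)
var0 = p0 * (1 - p0)                         # error variance at x = 0
var1 = p1 * (1 - p1)                         # error variance at x = 1
```

The two conditional variances differ by nearly a factor of two at these values, which is the heteroskedasticity that robust ("heteroskedastic-consistent") standard errors are designed to handle.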
Return to Top
Subject: Frequency Response Function
From: Benjamin Roberts <"no"@spam -- real address in sig.>
Date: Wed, 04 Dec 1996 09:24:39 +0900
Can anyone direct me to some C source code for computing the frequency
response function of linear filters, for use in the analysis of time
series?  Thanks a lot.
-- 
Benjamin Roberts 
Cognitive Laboratory 
Psychology Department 
University of Western Australia
benjamin@psy.uwa.edu.au
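Not C, but the calculation itself is small enough to sketch: evaluate H(e^{jw}) = B(z)/A(z) at z = e^{-jw} for a filter with numerator coefficients b and denominator coefficients a. A rough Python version (the 3-point moving-average filter is just an illustration, and this is a naive polynomial evaluation, not an optimized routine):

```python
import cmath
import math

def freq_response(b, a, omega):
    """Frequency response H(e^{j*omega}) = B(z)/A(z) of a linear filter,
    evaluated at z = e^{-j*omega} (omega in radians per sample)."""
    z = cmath.exp(-1j * omega)
    num = sum(bk * z ** k for k, bk in enumerate(b))
    den = sum(ak * z ** k for k, ak in enumerate(a))
    return num / den

# gain and phase of a 3-point moving average at omega = pi/2 rad/sample
H = freq_response([1 / 3, 1 / 3, 1 / 3], [1.0], math.pi / 2)
gain, phase = abs(H), cmath.phase(H)
```

At omega = 0 the moving average passes the signal unchanged (gain 1); at pi/2 rad/sample it attenuates to gain 1/3 with a linear-phase lag, which is the usual low-pass behaviour of such a smoother.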
Return to Top
Subject: '97 Courses
From: "University Associates of Princeton"
Date: 4 Dec 1996 02:12:28 GMT
University Associates of Princeton's Applied Statistics and Management
Science series of courses for 1997 is now available at
http://www.pcnet.com/~uapinet
Additionally, an overview of the 1997 schedule can be downloaded from the
"Information and Services Section" of the WWW page by selecting "From the
Desk of the Executive Director".
Also available for download now is the course outline for "Business
Statistics: Decision Making with Data".
Return to Top
Subject: Practical vs Educational
From: Antonio Black
Date: Tue, 03 Dec 1996 19:06:39 -0800
Hello fellow stats people of the globe,
I have a question directed to the professionals with real world
application of the many things we business stats majors learn.
I was recently dinged from a 99 to a 93 on an exam for not putting in
the hypothesis test text on my exam.  Fine, this bothers me a little,
but my question is for the future.
I am wondering what purpose these formalities are trying to fulfill.  It
would seem that, having done the multivariable regression, stepwise, and
come up with the correct answer, the rest would be almost irrelevant.
All my college career, I seem to have been coming up with the correct
process but making a mistake integrating the derivative or something;
yet now, when I understand and am as capable as I am in these more
advanced stat classes, stuff I really understand and can apply, one
little mistake blew my high A.
(to be honest, I had to just about make things up as I went along, I had
no idea what some of the data meant, but I did manage to figure it out,
and later that night managed to boast to my friends how I just about
made up my own theorem and practices out of a confusing set of numbers
during my exam)
About my grade, I don't mind, and I'm not bitter.  My grade will not
suffer too much, due to all of the hard work I have put in this semester.
I am, however, concerned about my future in the industry.
Is my end user going to want me to follow nonessential
procedures, put boxes around this and red-ink that?
(that's what I consider the [h0 Bo=0 : ha Bo<>0] to be)
McKinsey & Company, Andersen Consulting, what do you think?
I, Antonio Fraser, would greatly appreciate any guidance from those
professionals and otherwise in this industry who have been there, with
the deadlines, and all of their firms money/reputation on the line.
I will be peer to many of you in the near future, so please share your
experiences with me.
(otherwise, you’ll be dealing with another ignorant cherry)
Are you going to turn me loose on a project and tell me what the end
result should be, which questions to research, which answers to find,
etc.? Is there freedom and trust and professional courtesy in the
consulting industry?  Or, have I picked a new level methodical process
to enter, some advanced non practical/ non applicable set of data and
results to say I understand just to float our client’s boat?
I hope everybody replies and tells me great stories of new adventures in
problem solving, creative decision making, and new techniques /
solutions to time old problems. I surely hope for a wonderful adventure
in new territory, otherwise I would have picked a methodical service
industry to manage instead of setting goals for the supposedly
trend-setting consulting firms of the world.
Please respond via email if you have the time,
Tony
tonythor@sfsu.edu
Return to Top
Subject: Re: multivariate regression
From: aacbrown@aol.com
Date: 4 Dec 1996 04:08:36 GMT
etsae@csv.warwick.ac.uk (Mr C P Quigley) in
<581ljf$rsk@crocus.csv.warwick.ac.uk> writes:
> If I have a set of data with many variables used to predict a
> single output variable, what kind of methods are available to
> do this.  As I understand it, this is much more complicated
> than standard regression modelling with just one input, one
> output variable.
It's not really more complicated; it's just harder to select the correct
model. With N independent variables, you have 2^N possible sets of
predictors; when you consider transformations and cross-products, the set
of possible models multiplies quickly. Random chance dictates that some of
these models will appear to fit your data well, even if there is no real
effect.
A simple first approach is stepwise multiple regression, available from
any standard statistics package. You run linear regressions, adding and
subtracting independent variables one at a time until you arrive at a
satisfactory solution.
Aaron C. Brown
New York, NY
Return to Top
Subject: Re: growth, decline, steady state (roughly), or just outright fluctuation
From: nakhob@mat.ulaval.ca (Renaud Langis)
Date: Wed, 04 Dec 1996 03:05:00 GMT
On 2 Dec 1996 09:25:42 GMT, "Håkon Finne"  wrote:
>I have a large number of data sets, each of which contains a time series of consecutive annual
>observations, with a maximum of ten years for each set. There is a lot of fluctuation in the data.
>I need an algorithm that will section the data (according to the values of a particular variable)
>into periods characterized by growth, decline, steady state (roughly), or just outright
>fluctuation.
>Each period should last at least two or three years and the characterization should agree
>fairly well with subjective judgment when looking at a graph of the data. As I see it, one problem
>lies in determining inflection points that define the beginning/end of each period.
>If possible, please also give hints to how the algorithm could be implemented in SPSS!
>Thanx. (And yes; this is an econometric problem.)
>Mail, please, to Hakon.Finne@ifim.sintef.no
Well, you could simply compute a (heavy) smoothing of your data, then
compute the slope of the new curve at each point.  Set some threshold
values that tell whether the curve is in a growth, steady, or decline
state.  I think these values should be a function of the width of the
smoothing function.  Another way of doing it is to count the number of
consecutive positive (negative) values of the slope.  If you have more
than a certain number of consecutive positive (negative) values, then
the series is growing (declining).  Otherwise the series is steady.
This does not ensure, though, that the sub-series will be at least 2 or
3 years long.  That depends on the data.
I don't know how to implement this in SPSS, but I don't think it would
be hard.  Just use a moving average as your smoother.  SPSS Trend can do
this (I think).
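The moving-average-plus-slope-sign recipe above can be sketched in a few lines of Python (not SPSS syntax; the window width, flatness threshold, and minimum run length are arbitrary knobs that would need tuning against subjective judgment, as the original question notes):

```python
def moving_average(xs, w=3):
    """Centered moving average of odd width w, shrinking at the edges."""
    n = len(xs)
    out = []
    for i in range(n):
        lo, hi = max(0, i - w // 2), min(n, i + w // 2 + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out

def classify_periods(xs, w=3, min_run=2, flat=0.05):
    """Label each year-to-year step growth '+', decline '-', or steady '0'
    from the slope of the smoothed series; runs shorter than min_run
    are relabeled steady."""
    sm = moving_average(xs, w)
    slopes = [sm[i + 1] - sm[i] for i in range(len(sm) - 1)]
    labels = ['+' if s > flat else '-' if s < -flat else '0'
              for s in slopes]
    out, i = [], 0
    while i < len(labels):                 # collapse too-short runs
        j = i
        while j < len(labels) and labels[j] == labels[i]:
            j += 1
        run = labels[i] if j - i >= min_run else '0'
        out.extend([run] * (j - i))
        i = j
    return out

# rise, plateau, fall: 12 annual observations
labels = classify_periods([1, 2, 3, 4, 5, 5, 5, 5, 4, 3, 2, 1])
```

On this toy series the classifier recovers a growth period, a short steady stretch, and a decline, matching what the eye sees on a graph.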
R
Return to Top
Subject: Re: Occam's razor & WDB2T [was Decidability question]
From: ikastan@alumnae.caltech.edu (Ilias Kastanas)
Date: 4 Dec 1996 06:20:02 GMT
In article <32A0FD04.F96@postoffice.worldnet.att.net>,
kenneth paul collins   wrote:
>Ilias Kastanas wrote:
>
>>         "Analogue" measures "continuous" quantities, and so yields
>>    approximate answers (which of course may be fully adequate for
>>    various practical problems).
>
>I must "move on", but I'll tell you, Ilias, "continuous" ("analogue") is no
>more correlated to "approximate" than is "digital". What's correlated to
>"approximate" is a practitioner's willingness to do the work inherent in
>ferreting out "exactness".
	There isn't anything that can be measured to arbitrary precision.
   One may be fully willing and indifferent to cost and effort; it makes
   no difference.
>>    "Digital" (discrete), with exact answers, is different
>>    (and classical computability theory applies).
>
>Here's where we differ. "Discrete" Math is "exact" in terms of its own
>definitions, and =only= in those terms. And when one looks at such
>"discreteness" through a different "lens", one sees that the "discrete" rules
>are subsets of the "analogue" rules.
	What _are_ these other terms?
	How do you prepare a length, or voltage, or pressure, to have an
   _exact_ value?
							Ilias
Return to Top