Welch's t-test

For the Compleat Fan
Locked
rickyboy
Posts: 607
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Welch's t-test

Post by rickyboy »

Lutz,

Have you ever thought about adding Welch's t-test as a primitive to newLISP? Could be another "mode" of the current t-test primitive or you could come up with another name.

Here's a newLISP implementation -- for fun, but also I had to implement it for a stat analysis I recently performed because I couldn't assume that the two population variances were equal, and people better than I at this inform me that this is the better test under this condition.

Code: Select all

(define (welch-t-test xs ys)
  (letn (nx (length xs) ny (length ys)
         nx-1 (- nx 1) ny-1 (- ny 1)
         mx (mean xs) my (mean ys)
         s2x (s2 xs) s2y (s2 ys)
         s2x/nx (div s2x nx) s2y/ny (div s2y ny)
         df (div (sq (add s2x/nx s2y/ny))
                 (add (div (sq s2x/nx) nx-1)
                      (div (sq s2y/ny) ny-1)))
         t  (div (sub mx my)
                 (sqrt (add s2x/nx s2y/ny))))
    (list mx my (ssd xs) (ssd ys)
          t df (mul 2 (prob-t (abs t) df)))))
Helpers.

Code: Select all

;; Arithmetic mean
(define (mean xs) (div (apply add xs) (length xs)))

;; Square function
(define (sq x) (mul x x))

;; Sum of squares
(define (sumsq xs) (apply add (map sq xs)))

;; The sample variance, just the numerator.
(define (s2-num xs)
  (let (n (length xs))
    (sub (sumsq xs) (div (sq (apply add xs)) n))))

;; Sample variance
(define (s2 xs) (div (s2-num xs) (- (length xs) 1)))

;; Sample standard deviation
(define (ssd xs) (sqrt (s2 xs)))
In action.

Code: Select all

> (t-test '(10 4 7 1 1 6 1 8 2 4) '(4 6 9 4 6 8 9 3 9 7))
(4.4 6.5 3.238655414 2.273030283 -1.678359739 18 0.1105533782)
> (welch-t-test '(10 4 7 1 1 6 1 8 2 4) '(4 6 9 4 6 8 9 3 9 7))
(4.4 6.5 3.238655414 2.273030283 -1.678359739 16.13523413 0.1126979729)

> ;; Now, try unequal sample sizes (which will yield different t values).
> (t-test '(10 4 7 1 1 6 1 8 2 4) '(4 6 9 4 6 8 9 3))
(4.4 6.125 3.238655414 2.356601669 -1.260037546 16 0.2257247063)
> (welch-t-test '(10 4 7 1 1 6 1 8 2 4) '(4 6 9 4 6 8 9 3))
(4.4 6.125 3.238655414 2.356601669 -1.30656126 15.90049882 0.2110406063)
Thanks and happy hacking!
(λx. x x) (λx. x x)

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California
Contact:

Re: Welch's t-test

Post by Lutz »

How about the following approach, we add a second syntax pattern:

(t-test list-A list-B [true]) <--- Student's-t for independent and dependent samples
(t-test list-A list-B float-p) <--- Welch variant of Student's-t

When the float-p parameter is a probability number 0.0 < p < 1.0, a F-test for equality of variances will be performed internally. If the resulting probability value is less then the number in the float-p probability then the Welch flavor of Student's-t will be performed.

ps: one could force Welch to be performed by setting float-p to 1.0.

rickyboy
Posts: 607
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Re: Welch's t-test

Post by rickyboy »

Sounds good to me, Lutz. But, your program, your call, of course.

I don't know about an "F-test for equality of variances", since I thought that it was not possible to get the actual population (real) variances from just some samples (or alternatively, to determine if they were equal or close to equal (without computing the pop variances themselves)). I don't know. I'm not an expert on statistics, I "just play one on TV." :)

At any rate, I'm just using Welch's because (i) I don't know beforehand if the pop variances are equal and (ii) I don't know any way to determine pop variance equality (or more generally, what is an acceptable method to determine which t-test would be "better" to run). I hope that makes sense, and thanks a lot for looking at this!
(λx. x x) (λx. x x)

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California
Contact:

Re: Welch's t-test

Post by Lutz »

It all depends ;)

Welch is more computing intensive. If you have to do a lot of t-tests, you might only want to do it, if really required.

The population variances going into the F-ratio will always be estimates which get better with better N.

My approach would be: Force Welch using a 1.0 parameter when time is of no concern. If computing time is of concern, you could do the automatic approach e.g. setting p to 0.10 or 0.05 and let the t-test function decide. I suspect that differences will be neglect-able in most cases.

In any case I would retest significant or close to significant samples both ways. It's never good in statistics to blindly trust the numbers and test-criteria. Always look at the data. Especially in small samples, one outlier can do a lot of harm.

rickyboy
Posts: 607
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Re: Welch's t-test

Post by rickyboy »

Thanks, Lutz. That's valuable advice to me; I'll keep that in mind.
(λx. x x) (λx. x x)

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California
Contact:

Re: Welch's t-test

Post by Lutz »

Welch flavor of t-test done here:
http://www.newlisp.org/downloads/develo ... nprogress/

While at it, I also added a one sample t-test syntax pattern.

Any manual corrections are appreciated:
http://www.newlisp.org/downloads/develo ... tml#t-test

Ps: If you cannot make a binary to play with it, let me know.

xytroxon
Posts: 296
Joined: Tue Nov 06, 2007 3:59 pm
Contact:

Re: Welch's t-test

Post by xytroxon »

SCAT? LOL!

Doc. problem. Spelling of "Prosac" in some places and "Prozac" in others...

Calling an antidepressant brand or near brand name "depression medicine" is problematic.

"after a treatment with Prosac depression medication:"

Maybe use the word "Prosaic" as part of the "depression medication" pun.

"after a treatment with Prosaic depression medication:"

-- xytroxon ;o)
"Many computers can print only capital letters, so we shall not use lowercase letters."
-- Let's Talk Lisp (c) 1976

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California
Contact:

Re: Welch's t-test

Post by Lutz »

Thanks, perhaps we should write instead: "The effect of newLISP on a group of depressed programmers" :)

http://www.newlisp.org/downloads/develo ... tml#t-test

rickyboy
Posts: 607
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Re: Welch's t-test

Post by rickyboy »

Lutz wrote:Welch flavor of t-test done here:
http://www.newlisp.org/downloads/develo ... nprogress/

While at it, I also added a one sample t-test syntax pattern.
Gracias, hermano!
(λx. x x) (λx. x x)

rickyboy
Posts: 607
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Re: Welch's t-test

Post by rickyboy »

Lutz wrote:Any manual corrections are appreciated:
http://www.newlisp.org/downloads/develo ... tml#t-test
s/list-vecor/list-vector/g
(λx. x x) (λx. x x)

Locked