There are many ways to analyze and improve upon the statistics that academic competitions keep. The important questions to ask of any new statistic, or any measurement at all for that matter, are 1) what is it measuring, 2) is it reliable, and 3) does it measure what it purports to measure (i.e., is it valid)?

Points Created (PC) was designed in an attempt to measure what Randy Buehler (former Vandy great who now, IIRC, has his dream job at Wizards of the Coast) called the "shadow effect" - essentially, the "drag" on a player's scoring ability caused by his (or her) fellow players beating him (or her) to toss-ups - and to correct for that effect in player rankings.

That stat is not around anymore, for several reasons. The first is that it lacked what's called "face validity": on the face of it, it had problems. While some players had positive PC scores, most ended up with scores that had minus signs in front of them. Prima facie, most players are not a drag on their team. A negative number suggests a drag, although that's not what PC intended it to mean - PC just meant that their PPG should be adjusted downward. But it's obvious why this presentation helped keep PC from being popularly adopted.

To the best of my knowledge, Pat Matthews ran a validity test on PC only once. He reasoned that if PC was an accurate measure of the shadow effect, then rankings within teams would be preserved across tournaments. In other words, if category distribution and difficulty level are sufficiently randomized, then the shadow effect should be a constant with normal fluctuation. (The problems with this, of course, are 1) actual category distribution effects can only be measured with precise _a priori_ measurements, which no one has, to date, released, and 2) difficulty is so highly subjective that no one has yet bothered to propose a way to measure it objectively - but I digress.)

The results Pat came up with weren't statistically significant - which, given all the caveats above, doesn't necessarily mean the stat is bad, just the way it was tested - but he did turn up an interesting anomaly that ran counter to conventional wisdom: if you interrupt a toss-up and get it wrong, the other team, which hears it to the end, has less than a .5 chance of answering it correctly - and that difference _was_ statistically significant, when looked at after the fact.

Now, this was a data set from two tournaments. PC was never, to my knowledge, analyzed more widely - though the score was reported for many more tournaments. I personally would be very interested in seeing whether PC was correct - and in seeing more data on the "anomaly."

PC failed because its presentation left people with negative scores, and because no one ever proved that it measured what it was supposed to measure. PATH, and other new measurements, are worthy attempts. Time, and time only, will tell what they measure, whether they can measure it accurately, and whether they can measure it consistently.

Tom
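P.S. For the curious, here is a rough sketch (mine, in Python, with invented player names and numbers - not Pat's actual method or data) of the kind of rank-preservation check described above: if PC really captures a stable shadow effect, then the within-team ranking of PC-adjusted PPG should correlate from one tournament to the next.

    from scipy.stats import spearmanr

    # Hypothetical PC-adjusted PPG for the same four teammates at two
    # tournaments; the names and values are made up for illustration.
    tournament_a = {"Player1": 42.0, "Player2": 31.5, "Player3": 18.0, "Player4": 9.5}
    tournament_b = {"Player1": 38.5, "Player2": 35.0, "Player3": 14.0, "Player4": 11.0}

    players = sorted(tournament_a)
    # Spearman's rho asks: is the within-team ranking preserved across events?
    rho, p_value = spearmanr([tournament_a[p] for p in players],
                             [tournament_b[p] for p in players])
    print("rank correlation = %.2f, p = %.3f" % (rho, p_value))

With real data you would pool every team that played both tournaments and ask whether the correlations are consistently high - which, as I understand it, is roughly the comparison Pat was after.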