There are many ways to analyze and improve upon the statistics that academic competitions keep. The important questions to ask of any new statistic, or any measurement at all for that matter, are 1) what is it measuring, 2) is it reliable, and 3) does it measure what it purports to measure (i.e., is it valid)?

Points Created (PC) was designed in an attempt to measure what Randy Buehler (former Vandy great who now, IIRC, has his dream job at Wizards of the Coast) called the "shadow effect" - essentially, the "drag" on a player's scoring ability caused by his (or her) fellow players beating him (or her) to toss-ups - and to correct for that effect in player rankings.

That stat is not around anymore, for several reasons. The first is that it lacked what's called "face validity": on the face of it, it had problems. While some players had positive PC scores, most ended up with scores that had minus signs in front of them. Prima facie, most players are not a drag on their team. A negative number suggests a drag, although that's not what PC intended it to mean - PC just meant that their PPG should be adjusted downward. But it's obvious why this presentation helped keep PC from being popularly adopted.

To the best of my knowledge, Pat Matthews ran a validity test on PC only once. He reasoned that if PC was an accurate measure of the shadow effect, then rankings within teams would be preserved across tournaments. In other words, if category distribution and difficulty level are sufficiently randomized, then the shadow effect should be a constant with normal fluctuation. (The problems with this, of course, are 1) actual category distribution effects can only be measured with precise _a priori_ measurements, which no one has, to date, released, and 2) difficulty is so highly subjective that no one has yet bothered to propose a way to measure it objectively - but I digress.)

The results Pat came up with weren't statistically significant - which, given all the caveats above, doesn't necessarily mean the stat is bad, just the way it was tested - but he did turn up an interesting anomaly that ran counter to conventional wisdom: if you interrupt a toss-up and get it wrong, the other team, which hears it to the end, has less than a .5 chance of answering it correctly - and that difference _was_ statistically significant, when looked at after the fact.

Now, this was a data set from two tournaments. PC was never, to my knowledge, analyzed more widely - though the score was reported for many more tournaments. I personally would be very interested in seeing whether PC was correct - and in seeing more data on the "anomaly."

PC failed because its presentation left people with negative scores, and because no one ever proved that it measured what it was supposed to measure. PATH, and other new measurements, are worthy attempts. Time, and time only, will tell what they measure, whether they can measure it accurately, and whether they can measure it consistently.

Tom
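P.S. For the curious, here is a rough sketch (mine, in Python, with invented player names and numbers - not Pat's actual method or data) of the kind of rank-preservation check described above: if PC really captures a stable shadow effect, then the within-team ranking of PC-adjusted PPG should correlate from one tournament to the next.

    from scipy.stats import spearmanr

    # Hypothetical PC-adjusted PPG for the same four teammates at two
    # tournaments; the names and values are made up for illustration.
    tournament_a = {"Player1": 42.0, "Player2": 31.5, "Player3": 18.0, "Player4": 9.5}
    tournament_b = {"Player1": 38.5, "Player2": 35.0, "Player3": 14.0, "Player4": 11.0}

    players = sorted(tournament_a)
    # Spearman's rho asks: is the within-team ranking preserved across events?
    rho, p_value = spearmanr([tournament_a[p] for p in players],
                             [tournament_b[p] for p in players])
    print("rank correlation = %.2f, p = %.3f" % (rho, p_value))

With real data you would pool every team that played both tournaments and ask whether the correlations are consistently high - which, as I understand it, is roughly the comparison Pat was after.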