Difference between revisions of "Misuse of statistics"
Matt Weiner
Latest revision as of 09:32, 14 April 2023
In quizbowl, misuse of statistics arises when more and more elaborate formulae, computer programs, and statistical analysis tools are used in a futile attempt to make up for the fact that the premises or underlying data being fed into the model are incorrect or do not exist. Proponents can be blinded by the elegance of their algorithms, and they are often better able to discuss the details than less mathematically trained dissenters, which gives them an easy way to bulldoze simple but correct objections to the process. It can be a sign of bad quizbowl.
Examples include:
- Tournament formats that use complicated formulae to rank teams on factors other than win-loss record or progress through a power matching system, and end up producing self-evidently absurd results (2003 ICT)
- Power-matching systems which make teams play widely different strengths of schedule based on their performance in each game, but then take teams to the playoffs based on the incommensurable win-loss records produced. [To what extent this applies to the HSNCT itself is debatable, but local tournaments that use short formats with barely modified Swiss pairs and then cut to the playoffs on record alone are clearly doing something wrong. A fuller explanation of the math behind this is needed.]
- "Correlation tests" which attempt to mathematically "prove" which tiebreakers are more fair by starting with the unproven, unaccepted assumption that the point of a tiebreaker is to "predict who will win the next game between the two teams"
- PATH, which is laden with the assumption that all players on a team are perfect generalists and are being denied a chance to answer any tossup their teammates answer
- Every PPG-replacement or other player- or team-ranking formula ever produced, which seeks to sidestep the fact that the data collected about quizbowl games are insufficient by manipulating that data harder
- The formula for determining the JV champion at NAC, which takes arbitrary values related to teams at each site, such as "the number of teams in the tournament times 10" and "average PPG of playoff opponents times 2", and adds them together in order to compare an 8-0 team that played in DC to an 8-0 team that played in Chicago without actually arranging any way for them to play each other. At least seven teams in NAC history have gone undefeated without being named the tournament champion.
- Attempts to "prove cheating" that reduce to proving the tautology that exceptional performances are outliers, then deciding which outliers constitute cheating based on the same subjective assessment of the situation that could have been made without any statistical analysis. E.g., the assertion that "anyone who powers 3 tossups in every game of a tournament must be cheating, except if I decide that this could have plausibly happened without cheating based on my own evaluation of the player's skill and the tournament's difficulty."
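To make the arithmetic behind the NAC example concrete, the following is a minimal sketch that adds only the two terms quoted above. Any other terms in the real formula are unknown here and omitted, the function name `site_score` is hypothetical, and the field sizes and PPG figures are invented for illustration.

```python
def site_score(num_teams: int, avg_playoff_opp_ppg: float) -> float:
    """Add the two quoted terms; the real formula may contain others."""
    return num_teams * 10 + avg_playoff_opp_ppg * 2


# Invented figures for two undefeated (8-0) teams at different sites.
dc = site_score(num_teams=24, avg_playoff_opp_ppg=180.0)
chicago = site_score(num_teams=16, avg_playoff_opp_ppg=210.0)

print(dc, chicago)
```

With these made-up inputs the DC team comes out ahead even though the Chicago team faced higher-scoring playoff opponents, because the field-size term outweighs the difference. Neither number says anything about what would happen if the two teams actually played.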
Possible better uses of statistics
The coming of detailed stats means that factors such as buzzpoints, statistically meaningful assertions about the nationwide difficulty of clues or questions, and changes in player performance with various teammates and opponents present can now be measured. Greater insight into tournament difficulty and player performance may follow.
As technology progresses, real-time statistics may also be used to improve tournament formats. For example, large quizbowl tournaments never use true Swiss pair systems and instead rely on the card-passing power matching approximation, because waiting to calculate results from each round and then republish new pairings would take far too long under current methods of scorekeeping and statistical tabulation. If an entire tournament is using real-time scoresheets and all players can access the standings on electronic devices, then more formats become possible.
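As a rough sketch of what re-pairing from live standings could look like, the snippet below ranks teams by record and pairs each team with the highest-ranked opponent it has not yet played. This is a simplified greedy approach, not the procedure of any actual tournament or a full Swiss system; the function name, data layout, and sample standings are all assumptions made for illustration.

```python
def swiss_pairings(standings, played):
    """Greedy next-round pairings from live standings.

    standings: dict mapping team -> (wins, total points)
    played: set of frozenset({a, b}) for matchups already played
    With an odd number of teams, the last team left over is unpaired (a bye).
    """
    ranked = sorted(standings, key=lambda t: standings[t], reverse=True)
    pairings = []
    while len(ranked) >= 2:
        top = ranked.pop(0)
        # Highest-ranked opponent not yet faced; fall back to a rematch.
        idx = next(
            (i for i, t in enumerate(ranked)
             if frozenset((top, t)) not in played),
            0,
        )
        pairings.append((top, ranked.pop(idx)))
    return pairings


# Invented standings after two rounds; A and B have already met.
standings = {"A": (2, 500), "B": (2, 450), "C": (1, 400), "D": (1, 300)}
played = {frozenset(("A", "B"))}
next_round = swiss_pairings(standings, played)
print(next_round)
```

The sort and pairing step is trivial to compute; the bottleneck described above is getting complete results from every room in time, which is what real-time scoresheets would address.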