Difference between revisions of "Misuse of statistics"

From QBWiki
m (Gregory Gauthier moved page Mathturbation to Misuse of statistics: We can drop the Matt Weiner-coined juvenile title for this.)
 
(11 intermediate revisions by 4 users not shown)
Line 1: Line 1:
'''Mathturbation''' is the practice, in quizbowl, of using more and more elaborate formulae, computer programs, and statistical analysis tools in a futile attempt to make up for the fact that the premises or underlying data being fed into the model are incorrect or do not exist. It is a particularly insidious form of bad practice because its proponents can both be blinded by the elegance of their algorithms and be more able to discuss their details than less mathematically-trained dissenters, giving them an easy way to bulldoze simple but correct objections to the process.
+
In quizbowl, '''misuse of statistics''' can arise by using more and more elaborate formulae, computer programs, and statistical analysis tools in a futile attempt to make up for the fact that the premises or underlying data being fed into the model are incorrect or do not exist. Proponents can both be blinded by the elegance of their algorithms and be more able to discuss their details than less mathematically-trained dissenters, giving them an easy way to bulldoze simple but correct objections to the process. It can be a sign of [[bad quizbowl]].
  
Famous examples of mathturbation include:
+
Examples include:
*Tournament formats which rank teams based on factors other than win-loss through complicated formulae and end up producing self-evidently absurd results ([[2003 ICT]])
+
*Tournament formats which rank teams based on factors other than win-loss or progress through a power matching system through complicated formulae and end up producing self-evidently absurd results ([[2003 ICT]])
*Power-matching systems which make teams play widely different strengths-of-schedule based on their performance in each game, but then take teams to the playoffs based on the incommensurable win-loss records produced
+
*Power-matching systems which make teams play widely different strengths-of-schedule based on their performance in each game, but then take teams to the playoffs based on the incommensurable win-loss records produced [to what extent this applies to the HSNCT itself is debatable, but local tournaments that use short formats with barely modified swiss pairs and then cut to the playoffs on record alone are clearly doing something wrong - a larger explanation of the math behind this is needed]
 
*"Correlation tests" which attempt to mathematically "prove" which tiebreakers are more fair by starting with the unproven, unaccepted assumption that the point of a tiebreaker is to "predict who will win the next game between the two teams"
 
*"Correlation tests" which attempt to mathematically "prove" which tiebreakers are more fair by starting with the unproven, unaccepted assumption that the point of a tiebreaker is to "predict who will win the next game between the two teams"
 
*[[PATH]], which is laden with the assumption that all players on a team are perfect generalists and are being denied a chance to answer any tossup their teammates answer
 
*[[PATH]], which is laden with the assumption that all players on a team are perfect generalists and are being denied a chance to answer any tossup their teammates answer
 
*Every PPG-replacement or other player- or team-ranking formula ever produced, which seeks to sidestep the fact that the data collected about quizbowl games are insufficient by manipulating that data harder
 
*Every PPG-replacement or other player- or team-ranking formula ever produced, which seeks to sidestep the fact that the data collected about quizbowl games are insufficient by manipulating that data harder
*The formula for determining the JV champion at [[NAC]], which involves taking arbitrary values related to teams at each site such as "the number of teams in the tournament times 10" and "average PPG of playoff opponents times 2" and adding them together, in order to compare an 8-0 team who played in DC to an 8-0 team who played in Chicago without actually arranging any way for them to play each other. In 2014, the Chicago champion defeated the DC champion 3748 to 3255 using this system of made-up points. The DC team is perhaps the only team to ever travel to a national tournament, win all of their games, and not win the title.
+
*The formula for determining the JV champion at [[NAC]], which involves taking arbitrary values related to teams at each site such as "the number of teams in the tournament times 10" and "average PPG of playoff opponents times 2" and adding them together, in order to compare an 8-0 team who played in DC to an 8-0 team who played in Chicago without actually arranging any way for them to play each other. At least seven teams in NAC history have gone undefeated without being named the tournament champion.
 +
*Attempts to "prove cheating" that reduce to proving the tautology that exceptional performances are outliers, then deciding which outliers constitute cheating based on the same subjective assessment of the situation that could have been made without any statistical analysis. E.g., the assertion that "anyone who powers 3 tossups in every game of a tournament must be cheating, except if I decide that this could have plausibly happened without cheating based on my own evaluation of the player's skill and the tournament's difficulty."
  
[[Category: Quizbowl lingo]]
+
==Possible better uses of statistics==
 +
 
 +
The coming of [[detailed stats]] means that factors such as [[buzzpoints]], statistically meaningful assertions about the nationwide difficulty of clues or questions, and changes in player performance with various teammates and opponents present can now be measured. Greater insight into tournament difficulty and player performance may follow.
 +
 
 +
It is also possible that real-time statistics may be used to improve tournament formats based on the progress of technology. E.g., the reason large quizbowl tournaments never use true [[swiss pair]] systems and instead rely on the card-passing [[power matching]] approximation is that waiting to calculate results from each round and then republish new pairings would take far too long under current methods of scorekeeping and statistical tabulation. If an entire tournament is using realtime scoresheets and all players can access the standings on electronic devices, then more formats become possible.
 +
 
 +
[[category: bad quizbowl]]

Latest revision as of 09:32, 14 April 2023

In quizbowl, misuse of statistics can arise from using ever more elaborate formulae, computer programs, and statistical analysis tools in a futile attempt to make up for the fact that the premises or underlying data being fed into the model are incorrect or do not exist. Proponents can be blinded by the elegance of their algorithms and are also better able to discuss the details than less mathematically trained dissenters, giving them an easy way to bulldoze simple but correct objections to the process. It can be a sign of bad quizbowl.

Examples include:

  • Tournament formats which use complicated formulae to rank teams based on factors other than win-loss record or progress through a power matching system, and which end up producing self-evidently absurd results (2003 ICT)
  • Power-matching systems which make teams play widely different strengths-of-schedule based on their performance in each game, but then take teams to the playoffs based on the incommensurable win-loss records produced [to what extent this applies to the HSNCT itself is debatable, but local tournaments that use short formats with barely modified Swiss pairs and then cut to the playoffs on record alone are clearly doing something wrong - a larger explanation of the math behind this is needed]
  • "Correlation tests" which attempt to mathematically "prove" which tiebreakers are more fair by starting with the unproven, unaccepted assumption that the point of a tiebreaker is to "predict who will win the next game between the two teams"
  • PATH, which is laden with the assumption that all players on a team are perfect generalists and are being denied a chance to answer any tossup their teammates answer
  • Every PPG-replacement or other player- or team-ranking formula ever produced, which seeks to sidestep the fact that the data collected about quizbowl games are insufficient by manipulating that data harder
  • The formula for determining the JV champion at NAC, which involves taking arbitrary values related to teams at each site such as "the number of teams in the tournament times 10" and "average PPG of playoff opponents times 2" and adding them together, in order to compare an 8-0 team who played in DC to an 8-0 team who played in Chicago without actually arranging any way for them to play each other. At least seven teams in NAC history have gone undefeated without being named the tournament champion.
  • Attempts to "prove cheating" that reduce to proving the tautology that exceptional performances are outliers, then deciding which outliers constitute cheating based on the same subjective assessment of the situation that could have been made without any statistical analysis. E.g., the assertion that "anyone who powers 3 tossups in every game of a tournament must be cheating, except if I decide that this could have plausibly happened without cheating based on my own evaluation of the player's skill and the tournament's difficulty."
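
The NAC example above can be made concrete with a little arithmetic. The sketch below is only an illustration: it sums the two terms quoted in that bullet ("the number of teams in the tournament times 10" and "average PPG of playoff opponents times 2") and nothing else, since the rest of the formula is not given here. The function name and every input value are hypothetical.

```python
def partial_site_score(num_teams: int, avg_playoff_opp_ppg: float) -> float:
    """Sum ONLY the two terms quoted in the text.

    The actual NAC formula includes other terms that are not given
    in this article, so this is a partial, illustrative score.
    """
    return num_teams * 10 + avg_playoff_opp_ppg * 2


# Hypothetical inputs for two undefeated site champions (not real data).
site_a = partial_site_score(num_teams=40, avg_playoff_opp_ppg=300.0)
site_b = partial_site_score(num_teams=24, avg_playoff_opp_ppg=340.0)

print(site_a, site_b)  # 1000.0 920.0
```

With these made-up numbers, the champion at the larger site comes out ahead even though the other champion's playoff opponents scored more per game. Neither term measures how the two teams would fare against each other, which is the objection raised in the bullet.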

Possible better uses of statistics

The coming of detailed stats means that factors such as buzzpoints and changes in player performance with various teammates and opponents present can now be measured, and that statistically meaningful assertions can be made about the nationwide difficulty of clues or questions. Greater insight into tournament difficulty and player performance may follow.

It is also possible that, as technology progresses, real-time statistics may be used to improve tournament formats. E.g., the reason large quizbowl tournaments never use true Swiss pair systems and instead rely on the card-passing power matching approximation is that waiting to calculate results from each round and then republish new pairings would take far too long under current methods of scorekeeping and statistical tabulation. If an entire tournament is using real-time scoresheets and all players can access the standings on electronic devices, then more formats become possible.
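
As a rough idea of what recomputing pairings from live standings might involve, here is a minimal sketch of one Swiss-style pairing step. It is an assumption-laden illustration, not a description of any software actually used at tournaments: the function name, the greedy rematch-avoidance rule, and the alphabetical tie-break are all hypothetical choices.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def swiss_pairings(
    wins: Dict[str, int],
    played: Set[FrozenSet[str]],
) -> List[Tuple[str, str]]:
    """Pair teams with similar records, avoiding rematches where possible.

    Greedy sketch: a real system would also need byes for odd fields
    and backtracking when the greedy choice forces a rematch later.
    """
    # Best record first; team name breaks ties so the result is deterministic.
    unpaired = sorted(wins, key=lambda team: (-wins[team], team))
    pairings = []
    while len(unpaired) >= 2:
        team = unpaired.pop(0)
        # Highest-ranked remaining team not yet faced; index 0 if all were faced.
        idx = next(
            (i for i, opp in enumerate(unpaired)
             if frozenset((team, opp)) not in played),
            0,
        )
        pairings.append((team, unpaired.pop(idx)))
    return pairings


# Hypothetical standings after one round: A beat C, B beat D.
standings = {"A": 1, "B": 1, "C": 0, "D": 0}
already_played = {frozenset(("A", "C")), frozenset(("B", "D"))}
print(swiss_pairings(standings, already_played))  # [('A', 'B'), ('C', 'D')]
```

The computation itself is trivial. The bottleneck described above is collecting every result and republishing the pairings between rounds, which is what real-time scoresheets could remove.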