"Thus, for each team, P/20TH divided by the number of losses (normalized based on a 13 game schedule and adjusted for field strength) should be an accurate method for ranking the "bubble" teams."

I am not an expert in statistics; however, I strongly doubt that claim. In general:

[1] Saying that two effects are each linear does not allow one to claim that the ratio of the two effects should also be a good predictor.

[2] There is no good rationale for saying that, if the number of games played is less than 13, simply "scaling" the results up to 13 games is justified.

If, for example, the NE sectional had consisted of a double round robin instead of a best 2-of-3 playoff, Yale, MIT, and Williams would each have faced one another an additional time. Would the results of the second matchup have broken down the same way they did the first time (Yale 2-0, Williams 1-1, MIT 0-2)? Quite possibly. Would they always? Probably not. Both of MIT's losses were on the last tossup; conceivably, if the matches were replayed an infinite number of times, those three teams would each have roughly equal numbers of losses and wins. Therefore, should we penalize MIT and Williams and reward Yale on the basis of statistically small samples? Probably not. Yet dividing by losses, rescaled to 13 games, accomplishes precisely that result.

[Furthermore, comparing "tournament standings" and win-loss records only makes sense if all the SCT tournaments use the same structure: round-robins only. With any elimination structure beyond that, teams will have unequal numbers of meetings, which makes it difficult to say that one team is better than another. To see this, consider a tie for the fourth spot in a four-team single-elimination playoff. If the team that wins the tie-breaker then loses in the first round of the playoff, as it (normally) should, it accrues an extra loss that the other team did not have.]

(continued . . .)
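To put a rough number on the small-sample worry in [2], here is a minimal sketch in Python. It rests on assumptions of mine, not on anything in the original proposal: the three teams meet once each in a single round robin, every game is an independent 50/50 coin flip (that is, the teams are exactly evenly matched), and `scaled_losses` is a hypothetical helper that does the naive "rescale to 13 games" arithmetic. It enumerates all eight possible outcomes of the three games and counts how often evenly matched teams still finish 2-0, 1-1, 0-2.

```python
from fractions import Fraction
from itertools import product

# Assumption: a single round robin, each pair of teams meeting once.
TEAMS = ("Yale", "Williams", "MIT")
GAMES = [("Yale", "Williams"), ("Yale", "MIT"), ("Williams", "MIT")]


def loss_counts(outcome):
    """Return losses per team; outcome[i] is 0 if the first team
    listed in GAMES[i] wins that game, 1 if the second team wins."""
    losses = {team: 0 for team in TEAMS}
    for (first, second), winner_index in zip(GAMES, outcome):
        loser = second if winner_index == 0 else first
        losses[loser] += 1
    return losses


def split_probability():
    """Chance that three evenly matched teams finish 2-0, 1-1, 0-2.

    Assumption: every game is an independent 50/50 coin flip, so all
    2**3 = 8 outcomes are equally likely.
    """
    outcomes = list(product((0, 1), repeat=len(GAMES)))
    lopsided = sum(
        1 for o in outcomes if sorted(loss_counts(o).values()) == [0, 1, 2]
    )
    return Fraction(lopsided, len(outcomes))


def scaled_losses(losses, games_played, schedule_length=13):
    """Hypothetical helper: naively rescale a loss count to a
    schedule_length-game schedule."""
    return Fraction(losses * schedule_length, games_played)


print(split_probability())   # 3/4
print(scaled_losses(2, 2))   # 13
print(scaled_losses(1, 2))   # 13/2
```

Under those assumptions, three identical teams produce the same 2-0, 1-1, 0-2 standings three times out of four, and the rescaling then treats the 0-2 team as if it had lost 13 of 13. That is only an illustration of the sample-size problem, not a claim about how strong the actual teams were.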