An analysis of ten years of the four Grand Slam men's singles data for lack of independence of set outcomes


Pollard, Graham; Cross, Rod; Meyer, Denny


The objective of this paper is to use data from the highest level in men's tennis to assess whether there is any evidence to reject the hypothesis that the two players in a match have a constant probability of winning each set in the match. The data consists of all 4883 Grand Slam men's singles matches over the 10-year period from 1995 to 2004. Each match is categorised by its sequence of sets won (W) or lost (L) by the eventual winner (in set 1, set 2, set 3, ...). Thus, there are ten categories of matches: one three-set category (WWW), three four-set categories, and six five-set categories, ending with LLWWW. The methodology involves fitting several probabilistic models to the frequencies of these ten categories. One four-set category is observed to occur significantly more often than the other two. Correspondingly, a couple of the five-set categories occur more frequently than the others. This pattern is consistent when the data is split into two five-year subsets. The data provides significant statistical evidence that the probability of winning a set within a match varies from set to set. The data supports the conclusion that, at the highest level of men's singles tennis, the better player (not necessarily the winner) lifts his play in certain situations at least some of the time.
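To illustrate why unequal frequencies within the four-set and five-set categories bear on the hypothesis, the sketch below enumerates the ten winner-relative categories and computes their probabilities under the simplest baseline, in which one player wins every set independently with a fixed probability p. This is an illustration under stated assumptions, not one of the models fitted in the paper; the function name `category_probabilities` and the value p = 0.65 are hypothetical choices made here.

```python
from itertools import permutations

def category_probabilities(p):
    """Probability of each winner-relative set sequence in a best-of-five
    match, assuming one player wins every set independently with
    probability p (the constant-probability baseline)."""
    q = 1.0 - p
    probs = {}
    for losses in range(3):  # the eventual winner drops 0, 1 or 2 sets
        # The final set is always a W for the winner; arrange the other
        # two W's and the L's in every distinct order before it.
        prefixes = sorted(set(permutations("W" * 2 + "L" * losses)))
        for prefix in prefixes:
            seq = "".join(prefix) + "W"
            # Either player may turn out to be the eventual winner.
            probs[seq] = p**3 * q**losses + q**3 * p**losses
    return probs

probs = category_probabilities(0.65)  # 0.65 is an illustrative value only
for seq, pr in probs.items():
    print(f"{seq}: {pr:.4f}")
```

Under this baseline every four-set category has the same probability, and so does every five-set category, whatever the value of p. A four-set or five-set category that occurs significantly more often than its siblings is therefore hard to reconcile with a constant set-winning probability.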

Publication year

2006

Publication type

Journal article


Journal of Sports Science and Medicine, Vol. 5 (2006), pp. 561-566




Asist Group


Copyright © 2006 Journal of Sports Science and Medicine.