Equalising race distances for men and women in XC - a statistical analysis 

Preface:
I’m not an expert, just an athlete with nothing better to do
This data was pulled from publicly available sources
This is only a case study, not a comprehensive investigation


CASE STUDY: Scottish National XC Champs 2011-2020
Reasons for choosing this race:
Encapsulates both mass participation and elite competition in one race
Recently equalised race distances for senior men and women, providing sufficient data pre- and post- distance change
FACTORS BEING CONSIDERED:
Participation
Duration of race (w/ regards to officials & time for slowest runners)
Race excitement (w/ regards to spectators & race for title/medals)
Race quality (w/ regards to participation levels from elite to good club level athletes)
Participation
Pre change average number of finishers:
Men - 538
Women - 231
Total - 769
Post change average number of finishers:
Men - 666
Women - 294
Total - 960
Change:
Men +23.8%
Women +27.3%
Total +24.8%
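As a quick check on the arithmetic, here is a minimal Python sketch that recomputes the percentage changes from the averages quoted above. The helper name `pct_change` is my own choice for illustration, not something from the original analysis.

```python
# Average finisher counts quoted above: (pre change, post change).
finishers = {
    "Men": (538, 666),
    "Women": (231, 294),
    "Total": (769, 960),
}


def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Round to one decimal place, matching the precision used in the thread.
changes = {
    group: round(pct_change(pre, post), 1)
    for group, (pre, post) in finishers.items()
}

for group, change in changes.items():
    print(f"{group} {change:+.1f}%")
```

Note that this only compares two averages. It says nothing about whether the underlying year-on-year trend changed.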
It might appear as if the change in distance has caused an increase in participation. However, if we view the data as a whole, we can see that participation was already trending upwards, and there is no significant increase in that trend at the time of the race distance change.
There appears to be a significant increase in participation in 2019, which could be attributed to the distance change with a lag effect. Given the decrease in 2020, though, it seems more likely that this was due to other external factors.
Race duration
Pre change average time of slowest 5 runners:
Men - 74:20
Women - 49:58
Total - 124:18
Post change average time of slowest 5 runners:
Men - 69:58
Women - 71:51
Total - 141:49
Change:
Men -5.87%
Women +43.8%
Total +14.1%
As we can see, the decrease in race distance for men has had a limited effect on the race duration.
The distance increase for women has greatly increased the race duration, affecting the length of the schedule and the amount of time officials are required to be out on the course.
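The duration figures are mm:ss times, so they need converting to seconds before a percentage change makes sense. A minimal sketch of that step, using the numbers quoted above (the function names `to_seconds` and `pct_change` are illustrative, not from the original analysis):

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '74:20' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Average time of the slowest 5 runners: (pre change, post change).
slowest_five = {
    "Men": ("74:20", "69:58"),
    "Women": ("49:58", "71:51"),
    "Total": ("124:18", "141:49"),
}

duration_changes = {
    group: pct_change(to_seconds(pre), to_seconds(post))
    for group, (pre, post) in slowest_five.items()
}

for group, change in duration_changes.items():
    print(f"{group} {change:+.2f}%")
```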
Race excitement
For this factor, I have analysed the spread of the top 10 athletes in each race, as the closer this is, the more exciting it is for the spectators.
Figures are given as range, standard deviation.
Pre change top 10 spread:
Men - 1:51, 36.3s
Women - 2:06, 41.4s
Post change top 10 spread:
Men - 1:44, 32.4s
Women - 2:35, 54.6s
Change:
Men -7s, -3.9s
Women +29s, +13.2s
The distance decrease for men has marginally decreased the spread of the top 10.
The distance increase for women has significantly spread out the top 10, reducing the excitement of the challenge for the title and medals.
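For anyone wanting to repeat this on another race, the two spread measures can be sketched as below. The finish times here are HYPOTHETICAL, made up purely for illustration, and I am assuming a sample standard deviation (`statistics.stdev`). The thread does not say whether sample or population was used, so swap in `statistics.pstdev` if the latter is wanted.

```python
import statistics

# HYPOTHETICAL top 10 finish times (mm:ss), not taken from any real result.
top10 = [
    "32:10", "32:25", "32:31", "32:48", "33:02",
    "33:05", "33:20", "33:33", "33:47", "34:01",
]


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


times = sorted(to_seconds(t) for t in top10)

# Range: gap between 1st and 10th place, in seconds.
spread_range = times[-1] - times[0]

# Standard deviation of the ten times (sample sd, an assumption).
spread_sd = statistics.stdev(times)

print(f"range {spread_range // 60}:{spread_range % 60:02d}, sd {spread_sd:.1f}s")
```

The range only looks at 1st and 10th place, while the standard deviation reflects how all ten are distributed, which is why both are worth reporting.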
Race quality
For this factor I have analysed the spread of the top 50, top 100 and the entire field.
Top 50/100 spread can show us the difference in quality between the leaders and high level finishers
Total spread can show us the quality of the entire field
Top 50 & top 100 range:
Pre change:
Men - 4:59, 7:15
Women - 5:18, 7:55
Post change:
Men - 4:05, 5:43
Women - 6:53, 10:42
Change:
Men -18.1%, -21.1%
Women +29.9%, +35.2%
The decrease in race distance for men brings the top 50/100 significantly closer together.
The increase in race distance for women takes the top 50/100 significantly further apart.


This is to be expected given the change in race duration, but it is worth noting that the change has also made the top 50/100 spreads less equal between men and women.
Total field spread (standard deviation and interquartile range)
Pre change (sd, iqr):
Men - 6:44, 8:58
Women - 4:34, 6:18
Post change (sd, iqr):
Men - 6:34, 8:53
Women - 7:05, 10:10
Change:
Men -2.46%, -0.9%
Women +55.1%, +61.4%
The decrease in distance for men has had minimal effect on the total spread of the field.
The increase in distance for women has significantly spread out the entire field.


It is worth noting that the change in distances has made the spread of the entire field more equal between men and women.
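The two "equality" observations (top 50/100 less equal, whole field more equal) can be checked from the figures quoted in this thread. The sketch below uses the women's spread divided by the men's spread, and treats the distance of that ratio from 1 as the "gap". That measure is my own assumption for illustration; the original analysis does not state how equality was judged.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Spreads quoted in the thread, as (men, women) for each period.
spreads = {
    "top 50 range": {"pre": ("4:59", "5:18"), "post": ("4:05", "6:53")},
    "top 100 range": {"pre": ("7:15", "7:55"), "post": ("5:43", "10:42")},
    "field sd": {"pre": ("6:44", "4:34"), "post": ("6:34", "7:05")},
    "field iqr": {"pre": ("8:58", "6:18"), "post": ("8:53", "10:10")},
}


def gap(men: str, women: str) -> float:
    """Distance of the women/men spread ratio from 1 (0 means equal spreads)."""
    return abs(to_seconds(women) / to_seconds(men) - 1)


gaps = {
    measure: {period: gap(*pair) for period, pair in periods.items()}
    for measure, periods in spreads.items()
}

for measure, g in gaps.items():
    direction = "less equal" if g["post"] > g["pre"] else "more equal"
    print(f"{measure}: gap {g['pre']:.2f} -> {g['post']:.2f} ({direction})")
```

On these numbers the top 50 and top 100 gaps grow after the change, while the standard deviation and interquartile range gaps for the whole field shrink, which matches the observations above.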
Conclusion
In this case study, the equalising of race distances has been shown to:
Have little effect on participation levels
Increase event duration for officials
Reduce equality in the spread of the top 50/100
Increase equality in the spread of the entire field
Further considerations 
What this study does not consider:
How does this affect age groups other than senior?
How does this affect the athletes’ transition from junior to senior and retention levels through the age groups?
Is this data replicated in other events?

Closing remarks:
This was not meant to provide a solution, and I am neither qualified nor in a position to make this decision. It is meant to provide insight & evidence into the effects the decision may have.
Thanks for the inspiration & help with the data/analysis @CordyParker https://twitter.com/cordyparker/status/1352259722108395521