
Essay: 50m Freestyle Swimmers’ Performance: Analyzing Times, Heights, and Weights at the 2012 London Olympics

  • Published: 26 February 2023*
  • Last Modified: 22 July 2024



1. Statement of Task

The aim of this paper is to investigate the relationship between the times of 50m freestyle swimmers and their heights and weights, looking specifically at the top 30 swimmers in the 50m freestyle at the 2012 London Olympics. I chose this topic because of my fascination with the sport and my own history as a swimmer over the last few years. I competed in many local and school swimming competitions, and my fastest stroke was always freestyle.

My data was collected from the internet, as I chose to look specifically at the top 30 50m freestyle swimmers at the London 2012 Olympics. The data collected includes the Times (s), Heights (cm) and competition Weights (kg), competition weight being the weight at which the individual competed in the 2012 Olympics.

The rest of the paper is split into two sections, each examining one pair of variables for a relationship. The first section covers the times and heights, and the second covers the times and weights. First I plot scatter diagrams to show any observable correlation, and follow that by calculating the Pearson correlation coefficient and the regression line and, as a result, the regression equation. The equation allows predictions to be made. I test its reliability by finding the percentage error. I carry out a chi-squared test to further support my findings. I then repeat the process for the Times and Weights, and this helps draw a conclusion on how performance is related to the heights and weights of a swimmer.

2. Raw Data

3. Times and Heights of swimmers

3.1 Scatter Plots

I present the scatter plot in order to help visualise a relationship if one is present. If a trend is in fact present, the scatter plot will lead to drawing a line of best fit.

Visually there is next to no trend between the Heights and the Times of the 30 swimmers. The data seems to be fairly randomly spread out. The calculation of the Pearson correlation coefficient and the regression equation will allow for a more accurate investigation.

3.2 Pearson’s correlation coefficient and the regression equation

I use the correlation coefficient to determine whether there is a strong linear relationship between the two variables in question. If a relationship is present, it will allow me to draw a line of best fit and consequently predict results. The formula used to calculate the correlation coefficient is:

x= time (s)

y= height (cm)

Using the 2-Var Stats option on the TI-84 Plus GDC, I calculated these values:

Using the GDC’s LinReg option gave the value of r as 0.0731217558. The difference between the two r values is due to the rounding used when calculating the value manually. Because the GDC calculates r without intermediate rounding, it is more accurate and was used instead of the value above. Squaring it gives r² ≈ 0.0053, which tells us that only about 0.5% of the variation in the dependent variable can be explained by the independent variable. In addition, it leads me to conclude that a regression equation would not be appropriate, because an r value so close to 0 indicates no linear correlation.
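The calculation described above can be sketched in a few lines of Python. This is only an illustration: the `pearson_r` helper and the five-swimmer sample are hypothetical (the raw data table is not reproduced here), and only the value of r is taken from the essay. The sketch uses the sums-of-squares form of Pearson's formula, which may differ in layout from the formula used in the manual calculation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample for illustration only, NOT the essay's raw data
times = [21.8, 22.0, 22.1, 22.4, 22.6]   # x = time (s)
heights = [198, 185, 201, 190, 193]      # y = height (cm)
print(f"sample r = {pearson_r(times, heights):.4f}")

# r as reported by the GDC in the essay; r squared is the share of
# variation explained, which is much smaller than r itself
r_reported = 0.0731217558
r_squared = r_reported ** 2
print(f"r^2 = {r_squared:.4f}")
```

Squaring the reported r gives roughly 0.0053, so it is r² rather than r that is read as a percentage of variation explained.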

3.3 Chi-Squared Test

H0: The Heights of the 50m freestyle swimmers are independent of the Times of the swimmers

H1: The Heights of the 50m freestyle swimmers are dependent on the Times of the swimmers

To carry out the chi-squared test for the Times and Heights, I work out appropriate groups using the means of the data. I split each variable into “above-mean” and “below-mean”. This led to the creation of the Observed frequencies table for Heights and Times.
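The grouping step can be sketched as follows. The function name `split_at_means` and the four data pairs are hypothetical, and counting a value exactly equal to the mean as "above-mean" is an assumption of this sketch, since the text does not say how ties were handled.

```python
def split_at_means(xs, ys):
    """Build a 2x2 observed-frequency table by splitting both variables at their means.

    Rows: y below mean, y at or above mean.
    Columns: x below mean, x at or above mean.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    table = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        row = 0 if y < mean_y else 1  # assumption: ties count as "above"
        col = 0 if x < mean_x else 1
        table[row][col] += 1
    return table

# Hypothetical pairs for illustration only, NOT the essay's raw data
times = [21.8, 22.0, 22.2, 22.4]   # seconds
heights = [190, 200, 185, 195]     # cm
observed = split_at_means(times, heights)
print(observed)
```

Each swimmer lands in exactly one of the four cells, so the cell counts always add up to the number of swimmers.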

The calculated chi-squared value was found to be 0.133 to 3 decimal places. When calculating the expected values it was decided to leave them as fractions instead of decimals. A TI-84 GDC was used to calculate the chi-squared value, the p-value and the degrees of freedom:

chi-squared value: 0.133333

p-value: 0.715

df: 1

Chi-Squared Test Formula

The chi-squared critical value at 1 degree of freedom, at the 0.05 significance level, is 3.841, as shown in the table. The calculated chi-squared value came to 0.133, which is lower than the critical value of 3.841, so the null hypothesis is not rejected. The p-value of 0.715 is greater than the 0.05 significance level, which also supports keeping the null hypothesis. Therefore, I can conclude that the data show no significant relationship between the Heights and Times of the swimmers.
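As a rough check of the test above, the statistic and p-value for a 2x2 table can be computed by hand in Python. The observed table below is hypothetical: it was chosen only because it reproduces the reported statistic of 0.1333 for 30 swimmers, and it is not claimed to be the table used in the investigation. The sketch applies no continuity correction, and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom.

```python
import math

def chi_squared_2x2(observed):
    """Chi-squared test of independence for a 2x2 table (no continuity correction)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected frequency = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For df = 1 only: P(chi-squared > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical observed frequencies (assumption, see the note above)
observed = [[8, 7], [7, 8]]
chi2, p = chi_squared_2x2(observed)

CRITICAL_VALUE = 3.841  # df = 1, 0.05 significance level
print(f"chi2 = {chi2:.3f}, p = {p:.3f}, reject H0: {chi2 > CRITICAL_VALUE}")
```

Comparing the statistic with the critical value and comparing the p-value with 0.05 are two views of the same decision, so they always agree.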

4. Times and Weights of swimmers

4.1 Scatter Plots

Unlike the scatter plot representing Times and Heights, this one shows a slight observable trend sloping upwards, indicating that as the weights increase so do the times. The diagram presents only a weak positive correlation with some noticeable outliers. The strength of the relationship will be discussed more accurately with the regression equation.

 

4.2 Pearson’s correlation coefficient and the regression equation

In order to calculate the correlation coefficient the formula I use is:

Using the 2-Var Stats option on the TI-84 Plus GDC, I calculated the following values:

Using the GDC’s LinReg option gave the value of r as 0.63420340571. The difference between the two r values is due to the rounding used when calculating the value manually. Because the GDC calculates r without intermediate rounding, it is more accurate, so I shall use it instead of the value above. Squaring it gives r² ≈ 0.4022, which tells us that about 40.2% of the variation in the dependent variable can be explained by the independent variable. In addition, it leads me to conclude that a regression equation would be appropriate, because the value indicates a moderate positive correlation. With the values above and Sx=0.3003 and Sy=5.908 (these values were read from the 2-Var Stats option on the GDC) I find the regression equation, which gives the line of best fit.

Least squares regression equation (x = time in s, y = weight in kg):

y=12.477x-197.11
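The gradient of the regression line can be cross-checked from the quoted values of r, Sx and Sy, using the standard relation gradient = r × Sy / Sx. This is a small sketch rather than the method used on the GDC; the variable names are my own, and the intercept is not recomputed because the means of the times and weights are not quoted in the text.

```python
# Values quoted in the essay
r = 0.63420340571   # correlation coefficient from LinReg
s_x = 0.3003        # standard deviation of the times (s)
s_y = 5.908         # standard deviation of the weights (kg)

# Standard relation for the least squares gradient
slope = r * s_y / s_x
print(f"gradient = {slope:.3f}")

def intercept_from_means(mean_x, mean_y, gradient):
    """The least squares line passes through the point of means (mean_x, mean_y)."""
    return mean_y - gradient * mean_x
```

The result rounds to 12.477, which agrees with the gradient in the regression equation used for the predictions.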

4.3 Percentage error

I interpolate and extrapolate using the regression equation and compare the predicted values with the actual values. To start off I compare a predicted “y” value to its actual recorded value. If I look at the Times and choose 22.43 seconds as the x value, it has a weight of 80 kg as its recorded y value. I use my formula and replace the x with 22.43:

y=12.477x-197.11

y=12.477(22.43)-197.11

y=279.86-197.11

y=82.75

The value is close to the recorded y value. Because it is not exactly the same, I find the percentage error between them.

error=estimated value-true value

error=82.75-80

error=2.75

Now that I have found the error I can substitute it into the percentage error equation:

percentage error=(error/actual value)×100%

percentage error=(2.75/80)×100%

percentage error=3.4375%

The real and the predicted values are close to each other. Looking at the graph, we can see that if we choose 21.80 as the x value, its recorded y value is 85 kg. If we replace the x in the formula with 21.80:

y=12.477x-197.11

y=12.477(21.80)-197.11

y=272.00-197.11

y=74.89

The values are not the same, so I find the percentage error between the two values:

error=estimated value-true value

error=74.89-85

error= -10.11

Following the same procedure as before I substitute the error into the percentage error formula:

percentage error=(error/actual value)×100%

percentage error=(-10.11/85)×100%

percentage error= -11.8958% (calculated from the unrounded error of -10.1114)
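Both worked examples above can be reproduced with a short Python sketch. The function names are my own; the gradient, intercept and the two (time, weight) pairs are the ones used in the calculations above. Because the code does not round the predicted weight before dividing, the first percentage error comes out slightly below the hand-calculated 3.4375%.

```python
def predict_weight(time_s, slope=12.477, intercept=-197.11):
    """Predicted weight (kg) from the regression equation y = 12.477x - 197.11."""
    return slope * time_s + intercept

def percentage_error(estimated, actual):
    """Signed percentage error: negative when the estimate is below the true value."""
    return (estimated - actual) / actual * 100

# (time in s, true weight in kg) as used in the essay's calculations
points = [(22.43, 80), (21.80, 85)]
for time_s, actual_kg in points:
    estimate = predict_weight(time_s)
    error_pct = percentage_error(estimate, actual_kg)
    print(f"x = {time_s}: predicted {estimate:.2f} kg, error {error_pct:.4f}%")
```

Keeping the sign of the error makes it easy to see whether the line over-predicts or under-predicts a given swimmer.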

My first percentage error is fairly small, which suggests that the regression equation predicts this point quite well. At the second point I chose, my estimated value was lower than my true value, so the percentage error is negative, and at about 12% it is noticeably larger. At first it seemed reasonable to assume that I could extrapolate with data above and below my ranges. For example, my smallest time value is 21.77 seconds. If I try to predict the value for 20 seconds and substitute it for x, I end up with 52.43 kg. If we refer back to the raw data section we can see that this is below any of the recorded weights.

If we look at my biggest time value of 22.8 seconds and try to predict the value for 23.2 seconds, substituting it for x, the result is 92.36 kg, which is still within the weights recorded above. The implausibly low prediction at 20 seconds shows that it is not reasonable to extrapolate far beyond my ranges, even though the prediction at 23.2 seconds happens to fall within the recorded weights. Even though extrapolating is not reliable, the interpolation and the line of regression indicate a moderate positive correlation between the two factors.
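The difference between interpolating and extrapolating can be made explicit by flagging any prediction whose x value lies outside the observed times. This is a minimal sketch: the function name and the range constant are my own, and the range limits are the smallest and biggest time values quoted above.

```python
# Smallest and biggest recorded times (s), as quoted in the essay
OBSERVED_TIME_RANGE = (21.77, 22.8)

def predict_with_flag(time_s, slope=12.477, intercept=-197.11):
    """Return (predicted weight in kg, True if time_s is inside the observed range)."""
    low, high = OBSERVED_TIME_RANGE
    weight = slope * time_s + intercept
    return weight, low <= time_s <= high

for time_s in (20.0, 22.43, 23.2):
    weight, interpolated = predict_with_flag(time_s)
    kind = "interpolation" if interpolated else "extrapolation"
    print(f"x = {time_s} s: {weight:.2f} kg ({kind})")
```

A flag like this does not say whether a prediction is right, only that predictions outside the range rest on the assumption that the linear trend continues.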

4.4 Chi-Squared Test

H0: The Weights of the 50m freestyle swimmers are independent of the Times of the swimmers

H1: The Weights of the 50m freestyle swimmers are dependent on the Times of the swimmers

Using the same method as in the Heights and Times section, I calculate the observed values.

The calculated chi-squared value was found to be 7.033 to 3 decimal places. A TI-84 GDC was used to calculate the chi-squared value, the p-value and the degrees of freedom:

chi-squared value: 7.033492823

p-value: 0.0079

df: 1

Chi-Squared Test for Independence Formula

The chi-squared critical value at 1 degree of freedom at the 0.05 significance level is 3.841 (shown in the critical value table from the chi-squared test for the Heights and Times). The calculated chi-squared value came to 7.033, which is higher than the critical value of 3.841, so the null hypothesis is rejected. The p-value of 0.0079 is smaller than the 0.05 significance level, which also reinforces the rejection of the null hypothesis. Therefore, I can conclude from the data that there is a significant relationship between the Weights and Times of the swimmers.

5. Conclusion

I was surprised by the conclusion drawn when investigating the relationship between the heights and times of the swimmers. There was no linear relationship between the two, which is visible from the very start in the scatter plot, where the data points are spread evenly across the graph. This suggests that the two variables are independent of each other. The chi-squared test pointed the same way: the calculated chi-squared value was lower than the critical value, so the null hypothesis could not be rejected.

This was not the case when it came to investigating the weights and the times. There was a linear relationship between the two. The correlation coefficient was 0.6342, which is deemed to be a moderate positive correlation. Squaring it gives r² ≈ 0.402, which means that about 40.2% of the variation in the dependent variable can be explained by the independent variable. In my opinion this was strong enough to justify a regression equation. However, from the extrapolation calculations it seems that the correlation is not strong enough to extrapolate with confidence. I also carried out a chi-squared test. The calculated value was 7.033, which was higher than the critical value, so the null hypothesis is rejected: the weights of the swimmers are in fact dependent on the times of the swimmers.

I obtained my data from different websites, and I believe the main limitation of this investigation was the amount of data I collected. With only 30 swimmers, my ranges of data were narrow, which reduces the accuracy and reliability of my investigation. As an improvement I should spend more time collecting data and take a more comprehensive approach: rather than using only the top 30 50m freestyle swimmers, I could include swimmers from additional strokes. I could then compare how heights and weights affect swimmers who compete in different strokes.

About this essay:

If you use part of this page in your own work, you need to provide a citation, as follows:

Essay Sauce, 50m Freestyle Swimmers’ Performance: Analyzing Times, Heights, and Weights at the 2012 London Olympics. Available from:<https://www.essaysauce.com/essay-examples/2018-12-2-1543757862/> [Accessed 16-04-26].


* This essay may have been previously published on EssaySauce.com and/or Essay.uk.com at an earlier date than indicated.
