Summation
\displaystyle\sum_{i=1}^n
We use the sigma symbol (Σ) to represent a summation.
Instead of saying:
x_1 + x_2 + x_3 + x_4 + x_5 + x_6
We can write:
\displaystyle\sum_{i=1}^6x_i
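As a minimal sketch, the same sum can be written in Python. The values in `x` are made up for illustration; only the pattern of adding x_1 through x_6 comes from the notation above.

```python
# Hypothetical values standing in for x_1 ... x_6.
x = [2, 4, 6, 8, 10, 12]

# Explicit loop: add each x_i for i = 1..6.
total = 0
for x_i in x:
    total += x_i

# The built-in sum() computes the same total.
assert total == sum(x)
print(total)  # 42
```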
Measures of Spread
Five Number Summary – gives the values needed to calculate the range and the interquartile range.
- Minimum – the smallest value in the dataset.
- Q1 – the value below which 25% of the data fall.
- Q2 – the value below which 50% of the data fall (the median).
- Q3 – the value below which 75% of the data fall.
- Maximum – the largest value in the dataset.
Range – calculated as the difference between the maximum and the minimum.
range = maximum - minimum
IQR (Interquartile Range) – calculated as the difference between Q3 and Q1
IQR = Q_3 - Q_1
Steps to compute:
- Arrange the data set from lowest to highest.
- Take the lowest number as the minimum.
- Take the highest number as the maximum.
- Take the median of the whole data set as Q2.
- Take the median of the lower half of the data set as Q1. (When the number of values is odd, don't include Q2 itself in the lower half.)
- Take the median of the upper half of the data set as Q3. (Likewise, don't include Q2 itself in the upper half.)
- Subtract the minimum from the maximum to get the range.
- Subtract Q1 from Q3 to get the IQR.
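The steps above can be sketched in Python as follows. This is one possible implementation, not a standard library routine: the function name `five_number_summary` is hypothetical, and it assumes the data set has at least two values. It uses the median-of-halves convention described in the steps; other tools may compute quartiles differently and give slightly different Q1 and Q3 values.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, Q2, Q3, maximum) using the median-of-halves steps."""
    values = sorted(data)              # arrange from lowest to highest
    n = len(values)
    half = n // 2
    lower = values[:half]              # lower half
    upper = values[half + n % 2:]      # upper half; skips the middle value when n is odd
    return values[0], median(lower), median(values), median(upper), values[-1]

minimum, q1, q2, q3, maximum = five_number_summary([1, 5, 10, 3, 8, 12, 4, 1, 2, 8])
print(minimum, q1, q2, q3, maximum)   # 1 2 4.5 8 12
print("range =", maximum - minimum)   # range = 11
print("IQR =", q3 - q1)               # IQR = 6
```

The data set passed in here is the first one from the Examples in this section, so the printed values can be compared with the hand calculation.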
Examples
1, 5, 10, 3, 8, 12, 4, 1, 2, 8
1. 1, 1, 2, 3, 4, 5, 8, 8, 10, 12
2. Minimum = 1
3. Maximum = 12
4. Q2 = (4+5)/2 = 9/2 = 4.5
5. Q1 = 2
6. Q3 = 8
7. Range = 12-1 = 11
8. IQR = 8-2 = 6
5, 10, 3, 8, 12, 4, 1, 2, 8
1. 1, 2, 3, 4, 5, 8, 8, 10, 12
2. Minimum = 1
3. Maximum = 12
4. Q2 = 5
5. Q1 = (2+3)/2 = 5/2 = 2.5
6. Q3 = (8+10)/2 = 18/2 = 9
7. Range = 12-1 = 11
8. IQR = 9-2.5 = 6.5
Box Plot – useful for quickly comparing the spread of two data sets across key metrics such as the quartiles, maximum, and minimum.

- The beginning of the line to the left of the box and the end of the line to the right of the box represent the minimum and maximum values in a dataset.
- The visual distance between these markings is an indication of the range of the values.
- The box itself represents the IQR. The box begins at the Q1 value, ends at the Q3 value, and Q2, or the median, is represented by a line within the box.
Standard Deviation and Variance
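Standard Deviation – roughly how far, on average, each point lies from the mean of the points. It is the square root of the variance.
Variance – the average squared difference of each observation from the mean.
\text{variance} = \frac 1 n \displaystyle\sum_{i=1}^n (x_i - \bar{x})^2
\text{standard deviation} = \sqrt {\frac 1 n \displaystyle\sum_{i=1}^n (x_i - \bar{x})^2}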
How to Calculate Standard Deviation
Dataset
10, 14, 10, 6
- Calculate the mean.
\bar{x} = (\sum_{i=1}^4 x_i)/n \\ = (10+14+10+6)/4 \\ = 40/4 \\ = 10
- Calculate the distance of each observation from the mean and square the value.
(x_i - \bar{x})^2 \\ (10-10)^2 = 0^2 = 0 \\ (14-10)^2 = 4^2 = 16 \\ (10-10)^2 = 0^2 = 0 \\ (6-10)^2 = (-4)^2 = 16
- Calculate the variance, the average squared difference of each observation from the mean.
\frac 1 n \displaystyle\sum_{i=1}^n (x_i - \bar{x})^2 \\ = (0+16+0+16)/4 \\ = 32/4 \\ = 8
The variance is 8.
- Calculate the standard deviation, the square root of the variance.
\sqrt 8 \\ \approx 2.83
The standard deviation is approximately 2.83.
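The same calculation can be checked step by step in Python. This is a minimal sketch that mirrors the hand calculation; the variable names are chosen here for illustration.

```python
import math

data = [10, 14, 10, 6]
n = len(data)

# Step 1: the mean.
mean = sum(data) / n                               # 40 / 4 = 10.0

# Step 2: squared distance of each observation from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]    # [0.0, 16.0, 0.0, 16.0]

# Step 3: the variance is the average of those squared distances.
variance = sum(squared_diffs) / n                  # 32 / 4 = 8.0

# Step 4: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 2))  # 10.0 8.0 2.83
```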
Sample Problem
Dataset
1, 5, 10, 3, 8, 12, 4
- Mean – 6.14
\bar{x} = (\sum_{i=1}^7 x_i)/7 \\ 1+5+10+3+8+12+4 = 43 \\ 43/7 \\ \approx 6.14
- Variance
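(x_i-\bar{x})^2 \\ (1-6.14)^2 = (-5.14)^2 \approx 26.42 \\ (5-6.14)^2 = (-1.14)^2 \approx 1.30 \\ (10-6.14)^2 = 3.86^2 \approx 14.90 \\ (3-6.14)^2 = (-3.14)^2 \approx 9.86 \\ (8-6.14)^2 = 1.86^2 \approx 3.46 \\ (12-6.14)^2 = 5.86^2 \approx 34.34 \\ (4-6.14)^2 = (-2.14)^2 \approx 4.58 \\ (26.42+1.30+14.90+9.86+3.46+34.34+4.58)/7 \\ = 94.86/7 \\ \approx 13.55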
- Standard Deviation
\sqrt {13.55} \\ \approx 3.68
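As a cross-check, Python's standard `statistics` module can compute these values directly. The sketch below uses `pvariance` and `pstdev`, which divide by n as in this section's formula. Note that `stdev` divides by n - 1 instead (the sample standard deviation), so it gives a different result for the same data.

```python
from statistics import pstdev, pvariance, stdev

data = [1, 5, 10, 3, 8, 12, 4]

# Population versions divide by n, matching the formula used in this section.
print(round(pvariance(data), 2))  # 13.55
print(round(pstdev(data), 2))     # 3.68

# stdev() divides by n - 1 (sample standard deviation), so it is larger.
print(round(stdev(data), 2))      # 3.98
```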