Statistics – Week 1 focus
Subject: Mathematics
Class: Grade 11
Term: Term 4
Week: 1
Theme: General lesson support
Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data. It plays a crucial role in understanding trends, making informed decisions, and solving problems in various fields. In the South African context, statistics is vital for analyzing socio-economic data, monitoring public health, evaluating educational outcomes, and informing policy decisions related to issues like poverty, inequality, and unemployment. For example, understanding crime statistics can help allocate resources to improve safety in communities. Analyzing school performance data informs interventions to improve the quality of education.
2.1 Measures of Central Tendency
Measures of central tendency aim to find a single value that represents the 'center' of a dataset.
Mean (Average): The sum of all values divided by the number of values.
Ungrouped Data: Mean (x̄) = Σx / n, where Σx is the sum of all data points and n is the number of data points.
Grouped Data: Mean (x̄) = Σ(f * x) / Σf, where f is the frequency of each class interval and x is the midpoint of each class interval.
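The two mean formulas above can be sketched in Python. This is a minimal illustration, not part of the lesson note: the function names `mean_ungrouped` and `mean_grouped` are made up for this sketch, and the data are the cell phone bills and study-hours table used in this section's worked examples.

```python
def mean_ungrouped(values):
    """Mean of raw data: sum of the values divided by the number of values."""
    return sum(values) / len(values)


def mean_grouped(midpoints, frequencies):
    """Estimated mean of grouped data: sum(f * x) / sum(f)."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)


# Monthly cell phone bills in rand (ungrouped example)
bills = [50, 80, 120, 150, 90, 70, 100]

# Study-hours classes 0-4, 5-9, 10-14, 15-19: class midpoints and frequencies
midpoints = [2, 7, 12, 17]
frequencies = [5, 12, 8, 3]

print(round(mean_ungrouped(bills), 2))                 # 94.29
print(round(mean_grouped(midpoints, frequencies), 2))  # 8.61
```

The grouped mean is only an estimate, because every value in a class is treated as if it sat at the class midpoint.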
Example: Imagine we want to find the average monthly cell phone bill of 7 students. The bills are: R50, R80, R120, R150, R90, R70, R100.
x̄ = (50 + 80 + 120 + 150 + 90 + 70 + 100) / 7 = R660 / 7 ≈ R94.29
Grouped Data Example: Consider the following table showing the number of hours students spend studying per week.

| Hours | Frequency (f) | Midpoint (x) | f * x |
| ------- | ------------- | ------------ | ----- |
| 0 - 4 | 5 | 2 | 10 |
| 5 - 9 | 12 | 7 | 84 |
| 10 - 14 | 8 | 12 | 96 |
| 15 - 19 | 3 | 17 | 51 |

Σf = 5 + 12 + 8 + 3 = 28
Σ(f * x) = 10 + 84 + 96 + 51 = 241
Mean (x̄) = 241 / 28 ≈ 8.61 hours
Median: The middle value when the data is arranged in ascending order.
Ungrouped Data: If n is odd, the median is the (n+1)/2 th value. If n is even, the median is the average of the n/2 th and (n/2 + 1) th values.
Grouped Data: Use the formula: Median = L + [(n/2 - CF) / f] * w, where L is the lower boundary of the median class, n is the total frequency, CF is the cumulative frequency of the class before the median class, f is the frequency of the median class, and w is the class width.
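The median rules above can be sketched as two small Python functions. The names `median_ungrouped` and `median_grouped` are hypothetical, and the grouped version assumes equal class widths and positive frequencies, which the lesson's study-hours table satisfies.

```python
def median_ungrouped(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the (n+1)/2 th value, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_boundaries, frequencies, width):
    """Median = L + [(n/2 - CF) / f] * w, using the first class whose
    cumulative frequency reaches or exceeds n/2."""
    half = sum(frequencies) / 2
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must not be empty")


bills = [50, 80, 120, 150, 90, 70, 100]
print(median_ungrouped(bills))  # 90

# Study-hours classes 0-4, 5-9, 10-14, 15-19 have lower boundaries -0.5, 4.5, 9.5, 14.5
lower_boundaries = [-0.5, 4.5, 9.5, 14.5]
frequencies = [5, 12, 8, 3]
print(median_grouped(lower_boundaries, frequencies, width=5))  # 8.25
```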
Example: Using the cell phone bill example from above (R50, R80, R120, R150, R90, R70, R100). First, order the data: R50, R70, R80, R90, R100, R120, R150. With n = 7, the median is the (7+1)/2 = 4th value, which is R90.
Grouped Data Example: Using the study hours table above. Total frequency (n) = 28, therefore n/2 = 14. The median class is the first class whose cumulative frequency reaches or exceeds 14.
Cumulative frequencies: 5, 17, 25, 28. Therefore the median class is 5 - 9 hours.
L = 4.5 (lower boundary of the 5 - 9 class)
CF = 5 (cumulative frequency of the previous class)
f = 12 (frequency of the 5 - 9 class)
w = 5 (class width: upper boundary 9.5 minus lower boundary 4.5)
Median = 4.5 + [(14 - 5) / 12] * 5 = 4.5 + (9/12) * 5 = 4.5 + 3.75 = 8.25 hours
Mode: The value that appears most frequently in a dataset.
Ungrouped Data: Simply identify the most frequent value.
Grouped Data: The modal class is the class with the highest frequency. An estimate of the mode within that class can be calculated. It is usually sufficient to identify the modal class.
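Finding the mode and the modal class, as described above, can be sketched with Python's standard `collections.Counter`. The helper names `modes` and `modal_classes` are invented for this illustration; both return a list because a dataset can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """All values that share the highest count, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


def modal_classes(class_labels, frequencies):
    """Label(s) of the class or classes with the highest frequency."""
    highest = max(frequencies)
    return [label for label, f in zip(class_labels, frequencies) if f == highest]


shoe_sizes = [6, 7, 7, 8, 8, 8, 9, 9, 10]
print(modes(shoe_sizes))  # [8]

class_labels = ["0-4", "5-9", "10-14", "15-19"]
frequencies = [5, 12, 8, 3]
print(modal_classes(class_labels, frequencies))  # ['5-9']
```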
Example: Considering shoe sizes of learners in a class: 6, 7, 7, 8, 8, 8, 9, 9, 10. The mode is 8 (appears three times).
Grouped Data Example: In the study hours example, the modal class is 5 - 9 hours as it has the highest frequency (12).
2.2 Measures of Dispersion
Measures of dispersion describe how spread out the data is.
Range: The difference between the highest and lowest values in the dataset. Range = Maximum Value - Minimum Value. It is highly affected by outliers.
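A short Python sketch of the range, including a quick look at why it is sensitive to outliers. The function name `data_range` and the extra R900 bill are hypothetical and only serve the illustration.

```python
def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


bills = [50, 70, 80, 90, 100, 120, 150]
print(data_range(bills))  # 100

# Adding one hypothetical extreme bill of R900 changes the range drastically
print(data_range(bills + [900]))  # 850
```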
Example: For the cell phone bills (R50, R70, R80, R90, R100, R120, R150), the range is R150 - R50 = R100.
Quartiles: Divide the ordered dataset into four equal parts.
Q1 (First Quartile or Lower Quartile): The value below which 25% of the data lies.
Q2 (Second Quartile): The median.
Q3 (Third Quartile or Upper Quartile): The value below which 75% of the data lies.
Ungrouped Data: The position of Q1 = (n+1)/4 and the position of Q3 = 3(n+1)/4.
Example: For the cell phone bills (R50, R70, R80, R90, R100, R120, R150), n = 7.
Q1 position = (7+1)/4 = 2, so Q1 = R70 (the 2nd value).
Q3 position = 3(7+1)/4 = 6, so Q3 = R120 (the 6th value).
Interquartile Range (IQR): The difference between the upper and lower quartiles. IQR = Q3 - Q1. It represents the spread of the middle 50% of the data and is less affected by outliers.
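The quartile positions and the interquartile range can be sketched as follows. This is one possible reading of the (n+1)/4 position rule, not a definitive method: the function names are hypothetical, the sketch assumes at least three data values, and linear interpolation for fractional positions is an assumption the lesson does not state.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in ordered data.

    Fractional positions are linearly interpolated between the two
    neighbouring values (an assumption, not taken from the lesson).
    """
    whole = int(position)
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower
    return lower + fraction * (ordered[whole] - lower)


def quartiles(values):
    """Q1 and Q3 using positions (n+1)/4 and 3(n+1)/4; assumes n >= 3."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


bills = [50, 80, 120, 150, 90, 70, 100]
q1, q3 = quartiles(bills)
print(q1, q3)   # 70 120
print(q3 - q1)  # IQR: 50
```

Other quartile conventions exist (calculators and software may differ slightly), so results for small datasets can vary between tools.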
Example: Using the quartiles calculated above (Q1 = R70, Q3 = R120), the IQR = R120 - R70 = R50.
Semi-Interquartile Range: Half of the interquartile range. SIQR = (Q3 - Q1) / 2. It gives an idea of the average spread around the median.
Example: Using the IQR above, SIQR = R50 / 2 = R25.
Variance: The average of the squared differences from the mean. It measures the overall spread of the data.
Ungrouped Data: Variance (σ²) = Σ(x - x̄)² / n
Grouped Data: Variance (σ²) = Σ[f * (x - x̄)²] / Σf
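Both variance formulas above can be sketched in Python. The function names are hypothetical; both divide by n (or Σf), matching the formulas in this lesson rather than the n - 1 version used for samples.

```python
def variance_ungrouped(values):
    """Variance = sum((x - mean)^2) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_grouped(midpoints, frequencies):
    """Variance = sum(f * (x - mean)^2) / sum(f), with x the class midpoints."""
    total = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / total
    squared = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))
    return squared / total


# Monthly cell phone bills in rand
bills = [50, 80, 120, 150, 90, 70, 100]
print(round(variance_ungrouped(bills), 2))  # 938.78

# Study-hours classes 0-4, 5-9, 10-14, 15-19
midpoints = [2, 7, 12, 17]
frequencies = [5, 12, 8, 3]
print(round(variance_grouped(midpoints, frequencies), 2))
```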
Example: Using cell phone data (R50, R70, R80, R90, R100, R120, R150) and the calculated mean of R94.29:
σ² = [ (50-94.29)² + (70-94.29)² + (80-94.29)² + (90-94.29)² + (100-94.29)² + (120-94.29)² + (150-94.29)² ] / 7
σ² ≈ 6571.43 / 7 ≈ 938.78 (in squared rand)
Standard Deviation: The square root of the variance. It represents the typical deviation of data points from the mean. It is expressed in the same units as the original data. For the cell phone bills, σ = √938.78 ≈ R30.64.
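As a quick check, the standard deviation of the cell phone bills can be computed in Python by taking the square root of the variance. The sketch compares the result with `statistics.pstdev` from the standard library, which also divides by n; the variable names are chosen only for this illustration.

```python
import math
import statistics

# Monthly cell phone bills in rand
bills = [50, 80, 120, 150, 90, 70, 100]

mean = sum(bills) / len(bills)
variance = sum((x - mean) ** 2 for x in bills) / len(bills)
std_dev = math.sqrt(variance)

print(round(std_dev, 2))  # 30.64

# pstdev is the population standard deviation (divides by n, like the lesson's formula)
print(math.isclose(std_dev, statistics.pstdev(bills)))  # True
```

Note that `statistics.stdev` divides by n - 1 instead and would give a slightly larger value.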