Data handling: summarising and interpreting data – Week 4 focus
Subject: Mathematical Literacy
Class: Grade 11
Term: Term 4
Week: 4
Theme: General lesson support
This week, we delve into the crucial skill of summarising and interpreting data. In the modern world, we are constantly bombarded with information presented in the form of tables, charts, and graphs. Being able to understand and critically analyse this data is essential for making informed decisions in your personal life, participating effectively in your community, and succeeding in the workplace. From understanding your household budget to interpreting election results or evaluating the effectiveness of a public health campaign, the ability to work with data is indispensable. Data literacy is no longer a luxury, but a necessity in South Africa and globally.
2.1 Measures of Central Tendency: Measures of central tendency aim to find a "typical" value that represents the center of a dataset.
Mean: The arithmetic average of all values. To calculate the mean, sum all the values and divide by the total number of values.
Ungrouped Data: Mean = (Sum of all values) / (Number of values)
Grouped Data: Mean ≈ (Sum of (Midpoint of interval × Frequency)) / (Total Frequency). For grouped data, we approximate the mean because we don't know the exact values within each interval. We use the midpoint of each interval as a representative value.
Median: The middle value when the data is arranged in ascending order.
Ungrouped Data: If the number of values is odd, the median is the middle value. If the number of values is even, the median is the average of the two middle values.
Grouped Data: The median lies within the "median class" (the interval containing the middle value). We can estimate the median using interpolation:
Median ≈ L + [(n/2 - CF) / f] × w
Where:
L = Lower boundary of the median class
n = Total frequency
CF = Cumulative frequency of the class before the median class
f = Frequency of the median class
w = Class width
Mode: The value that appears most frequently in the dataset.
Ungrouped Data: Simply identify the value that occurs most often.
Grouped Data: The "modal class" is the interval with the highest frequency. We can approximate the mode as the midpoint of the modal class.
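The grouped-data formulas above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names `grouped_mean`, `grouped_median` and `modal_class_midpoint` are made up for this sketch, the class limits and frequencies are the household income table used in Example 2, and the code assumes whole-rand class limits so that each lower boundary sits 0.5 below the lower limit.

```python
# Income classes from Example 2 as (lower limit, upper limit, frequency).
classes = [
    (0, 1000, 20),
    (1001, 2000, 35),
    (2001, 3000, 30),
    (3001, 4000, 10),
    (4001, 5000, 5),
]


def grouped_mean(classes):
    """Estimate the mean: sum of (midpoint x frequency) / total frequency."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_sum = sum((lo + hi) / 2 * f for lo, hi, f in classes)
    return weighted_sum / total_frequency


def grouped_median(classes, width):
    """Estimate the median by interpolating inside the median class."""
    n = sum(f for _, _, f in classes)
    cumulative = 0  # CF: cumulative frequency before the current class
    for lo, _, f in classes:
        if cumulative + f >= n / 2:
            lower_boundary = lo - 0.5  # assumption: limits are 1 unit apart
            return lower_boundary + (n / 2 - cumulative) / f * width
        cumulative += f


def modal_class_midpoint(classes):
    """Approximate the mode as the midpoint of the most frequent class."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2


print(f"Mean   ≈ R{grouped_mean(classes):.2f}")
print(f"Median ≈ R{grouped_median(classes, width=1000):.2f}")
print(f"Mode   ≈ R{modal_class_midpoint(classes):.2f}")
```

The class width is passed in as `width` rather than derived from the limits, because the notes treat it as a given value (w) in the interpolation formula.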
Example 1: Calculating Mean, Median, and Mode (Ungrouped Data)
Consider the following data representing the number of learners in 8 Grade 11 Mathematical Literacy classes in a school in Gauteng: 28, 32, 30, 25, 32, 29, 31, 32.
Mean: (28 + 32 + 30 + 25 + 32 + 29 + 31 + 32) / 8 = 239 / 8 = 29.875 ≈ 30 learners.
Median: First, arrange the data in ascending order: 25, 28, 29, 30, 31, 32, 32, 32. Since there are 8 values (even), the median is the average of the 4th and 5th values: (30 + 31) / 2 = 30.5 learners.
Mode: The value 32 appears most frequently (3 times). Therefore, the mode is 32 learners.
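The answers in Example 1 can be checked with a few lines of Python. This is only a sketch for verifying hand calculations: it uses the `statistics` module from Python's standard library, and the variable names are chosen for this illustration.

```python
from statistics import mean, median, mode

# Number of learners in the 8 classes from Example 1.
class_sizes = [28, 32, 30, 25, 32, 29, 31, 32]

mean_size = mean(class_sizes)      # sum of values / number of values
median_size = median(class_sizes)  # average of the two middle values (n is even)
modal_size = mode(class_sizes)     # most frequent value

print(mean_size)    # 29.875
print(median_size)  # 30.5
print(modal_size)   # 32
```

Note that `median` sorts the data itself, so the list does not need to be arranged in ascending order first.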
Example 2: Calculating Mean, Median, and Mode (Grouped Data)
The following table shows the distribution of monthly income for a sample of 100 households in a rural village in KwaZulu-Natal:

| Income (Rands) | Frequency |
|---|---|
| 0 - 1000 | 20 |
| 1001 - 2000 | 35 |
| 2001 - 3000 | 30 |
| 3001 - 4000 | 10 |
| 4001 - 5000 | 5 |

Mean:

| Income (Rands) | Frequency | Midpoint | Midpoint × Frequency |
|---|---|---|---|
| 0 - 1000 | 20 | 500 | 10000 |
| 1001 - 2000 | 35 | 1500.5 | 52517.5 |
| 2001 - 3000 | 30 | 2500.5 | 75015 |
| 3001 - 4000 | 10 | 3500.5 | 35005 |
| 4001 - 5000 | 5 | 4500.5 | 22502.5 |
| Total | 100 | | 195040 |

Mean ≈ 195040 / 100 = R1950.40
Median: n = 100, so n/2 = 50. We need to find the class containing the 50th value.
Cumulative Frequencies: 20, 55, 85, 95, 100. The median class is 1001 - 2000 (because the cumulative frequency reaches 55, which is greater than 50).
L = 1000.5 (Lower boundary of the median class)
CF = 20 (Cumulative frequency of the class before the median class)
f = 35 (Frequency of the median class)
w = 1000 (Class width)
Median ≈ 1000.5 + [(50 - 20) / 35] × 1000 = 1000.5 + (30/35) × 1000 = 1000.5 + 857.14 = R1857.64
Mode: The modal class is 1001 - 2000 because it has the highest frequency (35). The approximate mode is the midpoint of this class: (1001 + 2000) / 2 = R1500.50
2.2 Measures of Dispersion: Measures of dispersion describe how spread out the data is.
Range: The difference between the highest and lowest values. Range = Maximum value - Minimum value. It's a simple measure but sensitive to outliers.
Interquartile Range (IQR): The difference between the upper quartile (Q3) and the lower quartile (Q1). IQR = Q3 - Q1. It represents the range of the middle 50% of the data and is less sensitive to outliers than the range.
Quartiles: Divide the data into four equal parts. Q1 is the 25th percentile, Q2 is the median (50th percentile), and Q3 is the 75th percentile.
Example 3: Calculating Range and IQR
Using the same data from Example 1 (number of learners in classes): 25, 28, 29, 30, 31, 32, 32, 32.
Range: 32 - 25 = 7 learners
IQR:
Q1: The median of the lower half of the data (25, 28, 29, 30) is (28 + 29) / 2 = 28.5
Q3: The median of the upper half of the data (31, 32, 32, 32) is (32 + 32) / 2 = 32
IQR = 32 - 28.5 = 3.5 learners
2.3 Box and Whisker Plots: A box and whisker plot (or boxplot) is a visual representation of the data using five key values:
Minimum value
Q1 (Lower Quartile)
Median (Q2)
Q3 (Upper Quartile)
Maximum value
The "box" represents the IQR (the middle 50% of the data), and the "whiskers" extend to the minimum and maximum values (or to a certain distance, beyond which values are considered outliers). Boxplots are useful for comparing the distribution of two or more datasets.
2.4 Outliers: Outliers are values that are significantly different from the rest of the data. They can skew the mean and affect the interpretation of the data.
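The five values behind a boxplot, and a simple outlier check, can be sketched in Python using the class sizes from Example 1. The quartiles here follow the same "median of each half" method as Example 3. The cut-off of 1.5 × IQR beyond Q1 and Q3 is a widely used convention and an assumption of this sketch, since the notes do not fix a specific distance. The function names are illustrative only.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower_half = data[:half]   # values below the median position
    upper_half = data[-half:]  # values above the median position
    return min(data), median(lower_half), median(data), median(upper_half), max(data)


def find_outliers(values, k=1.5):
    """List values further than k x IQR below Q1 or above Q3 (k = 1.5 is an assumption)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


class_sizes = [28, 32, 30, 25, 32, 29, 31, 32]
minimum, q1, q2, q3, maximum = five_number_summary(class_sizes)

print("Range:", maximum - minimum)
print("IQR:", q3 - q1)
print("Outliers:", find_outliers(class_sizes))
```

For this data the fences fall at 23.25 and 37.25, so no class size is flagged as an outlier.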