Data handling: summarising and interpreting data – Week 5 focus
Subject: Mathematical Literacy
Class: Grade 11
Term: Term 4
Week: 5
Theme: General lesson support
Data handling is a crucial skill for navigating the modern world. We are constantly bombarded with information, from news articles to social media posts to statistics in reports. Being able to summarise and interpret data effectively allows us to make informed decisions, understand trends, and critically evaluate claims. In South Africa, this skill is particularly important for understanding issues like unemployment rates, crime statistics, healthcare access, and educational outcomes. Understanding this data allows us as citizens to participate more effectively in discussions around these issues.
2.1 Measures of Central Tendency: These values represent the 'center' of a dataset.
Mean (Average): The sum of all values divided by the number of values.
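Formula: Mean = (Sum of all values) / (Number of values). For grouped data, multiply the midpoint of each class interval by its frequency, add these products, and divide by the total frequency. The result is an estimate, because the individual values inside each class are not known.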
Median: The middle value when the data is arranged in ascending order. If there is an even number of values, the median is the average of the two middle values. For grouped data, we use the formula: Median = L + [(N/2 - CF) / f] × w, where L is the lower boundary of the median class, N is the total frequency, CF is the cumulative frequency of the class before the median class, f is the frequency of the median class, and w is the class width.
Mode: The value that appears most frequently in the dataset. For grouped data, it is the modal class (the class with the highest frequency). A dataset can have no mode, one mode (unimodal), or several modes (bimodal, trimodal, etc.).
Example 1 (Mean, Median, Mode - Ungrouped Data): A tuck shop sells the following number of cool drinks each day for a week: 25, 30, 28, 25, 32, 27, 29.
Mean: (25 + 30 + 28 + 25 + 32 + 27 + 29) / 7 = 196 / 7 = 28
Median: First, order the data: 25, 25, 27, 28, 29, 30, 32. The middle value is 28. So, the median is 28.
Mode: The number 25 appears twice, more than any other number. So, the mode is 25.
Example 2 (Mean - Grouped Data): A survey of taxi fares between Johannesburg and Pretoria yielded the following data:

| Fare (R) | Frequency |
|---|---|
| 50 - 59 | 5 |
| 60 - 69 | 12 |
| 70 - 79 | 8 |
| 80 - 89 | 3 |
| 90 - 99 | 2 |

To calculate the mean, we first find the midpoint of each class interval:

| Fare (R) | Frequency | Midpoint | Midpoint * Frequency |
|---|---|---|---|
| 50 - 59 | 5 | 54.5 | 272.5 |
| 60 - 69 | 12 | 64.5 | 774 |
| 70 - 79 | 8 | 74.5 | 596 |
| 80 - 89 | 3 | 84.5 | 253.5 |
| 90 - 99 | 2 | 94.5 | 189 |

Sum of (Midpoint * Frequency) = 272.5 + 774 + 596 + 253.5 + 189 = 2085
Total Frequency = 5 + 12 + 8 + 3 + 2 = 30
Mean = 2085 / 30 = R69.50
2.2 Measures of Dispersion: These values describe the spread or variability of the data.
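Looking back at section 2.1 before going further: the hand calculations in Examples 1 and 2 can be double-checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module. The variable names and the `grouped_mean` helper are illustrative choices, not part of the lesson, and the grouped mean assumes each class is represented by its midpoint, as in Example 2.

```python
from statistics import mean, median, multimode

# Example 1: cool drinks sold per day over one week
sales = [25, 30, 28, 25, 32, 27, 29]

sales_mean = mean(sales)        # sum of the values divided by how many there are
sales_median = median(sales)    # middle value once the data is sorted
sales_modes = multimode(sales)  # list of the most frequent value(s)

# Example 2: taxi fares as (lower limit, upper limit, frequency)
fare_classes = [(50, 59, 5), (60, 69, 12), (70, 79, 8), (80, 89, 3), (90, 99, 2)]


def grouped_mean(classes):
    """Estimate the mean of grouped data from the class midpoints."""
    total_frequency = sum(freq for _, _, freq in classes)
    weighted_sum = sum((low + high) / 2 * freq for low, high, freq in classes)
    return weighted_sum / total_frequency


fare_mean = grouped_mean(fare_classes)

print("Mean:", sales_mean, "Median:", sales_median, "Mode(s):", sales_modes)
print("Estimated mean fare: R", fare_mean)
```

`multimode` returns a list, so it also copes with bimodal or trimodal data. The grouped result should agree with the R69.50 found in Example 2.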
Range: The difference between the highest and lowest values in the dataset. Range = Maximum value - Minimum value.
Interquartile Range (IQR): The difference between the upper quartile (Q3) and the lower quartile (Q1). IQR = Q3 - Q1. Q1 is the median of the lower half of the data, and Q3 is the median of the upper half of the data. If the dataset has an odd number of values, the median is itself one of the data values; exclude it from both halves when calculating Q1 and Q3.
Percentiles: Divide the ordered data into 100 equal parts. The nth percentile is the value below which n% of the data falls. For example, the 25th percentile is Q1, and the 75th percentile is Q3.
Example 3 (Range, IQR - Ungrouped Data): Using the cool drink sales data from Example 1: 25, 30, 28, 25, 32, 27, 29.
Range: Maximum value (32) - Minimum value (25) = 7
IQR: First, order the data: 25, 25, 27, 28, 29, 30, 32. The median (28) is excluded from both halves.
Q1 (Lower Quartile): The median of the lower half (25, 25, 27) is 25.
Q3 (Upper Quartile): The median of the upper half (29, 30, 32) is 30.
IQR = Q3 - Q1 = 30 - 25 = 5
2.3 Box-and-Whisker Plots: A visual representation of data using quartiles. It displays the minimum value, Q1, median (Q2), Q3, and maximum value. Outliers can also be represented. The 'box' represents the IQR (Q1 to Q3). The line inside the box represents the median (Q2). The 'whiskers' extend from the box to the minimum and maximum values (or to the furthest data points within a certain range, excluding outliers).
Creating a Box-and-Whisker Plot: Order the data. Find the minimum value, Q1, median, Q3, and maximum value. Draw a number line that covers the range of your data. Mark the five key values above the number line. Draw the box from Q1 to Q3. Draw a line inside the box at the median. Draw the whiskers from the box to the minimum and maximum values.
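The first two steps above (ordering the data and finding the five key values) can be sketched in Python as follows. This is a minimal illustration, not a prescribed method. The function name `five_number_summary` is a hypothetical choice, and the quartiles follow the rule described in section 2.2, where the median is left out of both halves when the number of values is odd.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Q1 and Q3 are the medians of the lower and upper halves of the sorted
    data. With an odd number of values the overall median is excluded from
    both halves.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed to find quartiles")
    data = sorted(values)
    half = len(data) // 2                 # size of each half
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])


# Cool drink sales from Example 1
sales = [25, 30, 28, 25, 32, 27, 29]
minimum, q1, med, q3, maximum = five_number_summary(sales)

data_range = maximum - minimum   # spread of the whole dataset
iqr = q3 - q1                    # spread of the middle half (the 'box')

print(minimum, q1, med, q3, maximum)
print("Range:", data_range, "IQR:", iqr)
```

For the cool drink data this reproduces the range of 7 and the IQR of 5 from Example 3. Library routines such as `statistics.quantiles` may use a different interpolation rule, so their quartiles can differ slightly from the hand method for some datasets.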
Example 4 (Box-and-Whisker Plot): Using the cool drink sales data from Examples 1 and 3: 25, 30, 28, 25, 32, 27, 29.
We have already found: Minimum = 25, Q1 = 25, Median = 28, Q3 = 30, Maximum = 32.
Draw a number line from 20 to 35. Mark the five key values and construct the box-and-whisker plot. Because the minimum and Q1 are both 25, the left whisker has no length and the box starts at the minimum. This plot helps visualise the spread of the data; for example, we can immediately see the range and the IQR.
2.4 Interpreting Data and Identifying Biases: It is important to critically analyse data and be aware of potential biases.
Sample Size: Is the sample size large enough to be representative of the population? A small sample size might not accurately reflect the overall population.
Sampling Method: Was the data collected randomly or was there a specific selection process that could introduce bias? For example, surveying only people in affluent areas might not accurately reflect the income distribution of the entire country.
Data Presentation: How is the data presented? Graphs and charts can be manipulated to exaggerate or downplay certain trends.
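One common way a graph can exaggerate a trend is a vertical axis that does not start at zero. The sketch below compares how tall two bars are drawn with and without a truncated axis. The figures (50 and 55) and the function name `drawn_height_ratio` are hypothetical, chosen only for illustration, and are not taken from any real dataset.

```python
def drawn_height_ratio(smaller, larger, axis_start=0):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    if axis_start >= smaller:
        raise ValueError("the axis must start below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values for two years (illustration only)
last_year, this_year = 50, 55

honest_ratio = drawn_height_ratio(last_year, this_year)                    # axis starts at 0
truncated_ratio = drawn_height_ratio(last_year, this_year, axis_start=45)  # axis starts at 45

print("Axis from 0: second bar is", honest_ratio, "times as tall")
print("Axis from 45: second bar is", truncated_ratio, "times as tall")
```

The actual increase is 10%, yet with the axis starting at 45 the second bar is drawn twice as tall as the first. Checking where the axis starts is a quick test to apply when reading any bar graph.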