Lesson Notes By Weeks and Term v5 - Grade 11

Statistics – Week 5 focus


Subject: Mathematics

Class: Grade 11

Term: Term 4

Week: 5

Theme: General lesson support


Performance objectives

Lesson summary

This week we go deeper into statistics, building on what you learned in previous grades, with a focus on measures of dispersion and data representation. Statistics is more than numbers: it is a tool for understanding and interpreting the world around us. From analysing crime rates in different provinces to understanding the distribution of income in South Africa, statistics provides the framework for informed decision-making. Measures of dispersion show how spread out the data is, and therefore how well an average such as the mean represents the dataset.

Lesson notes

2.1 Measures of Dispersion

Measures of dispersion describe the spread or variability of data around a central value. Low dispersion means the data points are clustered closely together, while high dispersion means they are more spread out.

Range: The simplest measure of dispersion. It is the difference between the maximum and minimum values in a dataset.

Range = Maximum value - Minimum value

Example: Consider the ages of students in a class: 16, 17, 15, 18, 16. The range is 18 - 15 = 3 years.

Disadvantage: Very sensitive to outliers (extreme values).
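As a quick check of the range calculation, here is a minimal Python sketch. The function name data_range is only an illustrative choice and is not part of the lesson.

```python
def data_range(values):
    """Return the range: maximum value minus minimum value."""
    return max(values) - min(values)


ages = [16, 17, 15, 18, 16]   # ages from the example above
print(data_range(ages))       # 18 - 15 = 3
```

Adding one extreme value, for example data_range(ages + [60]), changes the result from 3 to 45, which illustrates how sensitive the range is to outliers.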

Interquartile Range (IQR): The range of the middle 50% of the data. It is the difference between the upper quartile (Q3) and the lower quartile (Q1).

IQR = Q3 - Q1

Q1 is the value below which 25% of the data falls. Q3 is the value below which 75% of the data falls.

To find Q1 and Q3, first order the data from smallest to largest. Then find the median (Q2). Q1 is the median of the data below Q2, and Q3 is the median of the data above Q2. If the number of values is odd, Q2 is itself a data point and is left out of both halves. If the number of values is even, Q2 lies between two data points and every data point falls into one of the two halves.

Example: Data: 2, 4, 6, 8, 10, 12, 14. Q2 = 8. The lower half is 2, 4, 6, so Q1 = 4 (its median). The upper half is 10, 12, 14, so Q3 = 12. IQR = 12 - 4 = 8.

Advantage: Less sensitive to outliers than the range.
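The median-of-halves method described above can be sketched in Python as follows. This is a minimal illustration that assumes the median is left out of both halves when the number of values is odd. The function name quartiles is hypothetical, and calculators or software packages may use slightly different quartile rules.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Needs at least two values, otherwise the halves are empty.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # skips the middle value when n is odd
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles([2, 4, 6, 8, 10, 12, 14])
print(q1, q2, q3)   # 4 8 12
print(q3 - q1)      # IQR = 8
```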

Semi-Interquartile Range: Half the interquartile range.

Semi-IQR = IQR / 2 = (Q3 - Q1) / 2

Variance: A measure of how spread out the data is from the mean. It is the average of the squared differences from the mean.

For a population, the variance (σ²) is calculated as:

σ² = Σ(xᵢ - μ)² / N

where xᵢ is each data point, μ is the population mean, and N is the population size.

For a sample, the variance (s²) is calculated as:

s² = Σ(xᵢ - x̄)² / (n - 1)

where xᵢ is each data point, x̄ is the sample mean, and n is the sample size. We use (n - 1) instead of n to provide a better estimate of the population variance from the sample.

Example: Data: 4, 6, 8, 10, 12.

Calculate the mean: (4 + 6 + 8 + 10 + 12) / 5 = 8

Calculate the squared differences from the mean: (4 - 8)² = 16, (6 - 8)² = 4, (8 - 8)² = 0, (10 - 8)² = 4, (12 - 8)² = 16

Sum the squared differences: 16 + 4 + 0 + 4 + 16 = 40

Divide by (n - 1) = 4 (since this is a sample): 40 / 4 = 10

Therefore, the sample variance is 10.

Standard Deviation: The square root of the variance. It is a measure of how spread out the data is from the mean, expressed in the same units as the data.

For a population, the standard deviation (σ) is: σ = √(σ²)

For a sample, the standard deviation (s) is: s = √(s²)
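The variance and standard deviation calculations can be followed step by step in Python. This is a sketch using the example data 4, 6, 8, 10, 12. The variable names are illustrative, and the standard library's statistics module is used only to cross-check the manual result.

```python
import math
import statistics

data = [4, 6, 8, 10, 12]

mean = sum(data) / len(data)                      # 8.0
squared_diffs = [(x - mean) ** 2 for x in data]   # 16, 4, 0, 4, 16
total = sum(squared_diffs)                        # 40.0

sample_variance = total / (len(data) - 1)         # divide by n - 1
population_variance = total / len(data)           # divide by N
sample_sd = math.sqrt(sample_variance)

print(sample_variance)       # 10.0
print(population_variance)   # 8.0
print(round(sample_sd, 2))   # 3.16

# Cross-check against the standard library
assert sample_variance == statistics.variance(data)
assert population_variance == statistics.pvariance(data)
```

Treating the same five values as a whole population gives a smaller variance (8 instead of 10), because the sum of squared differences is divided by N = 5 rather than n - 1 = 4.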

Example: Using the previous example where the sample variance was 10, the sample standard deviation is √10 ≈ 3.16.

2.2 Box and Whisker Plots

A box and whisker plot (or boxplot) is a graphical representation of data based on the five-number summary:

Minimum value
Lower Quartile (Q1)
Median (Q2)
Upper Quartile (Q3)
Maximum value

The "box" represents the IQR (Q1 to Q3), with a line indicating the median (Q2). The "whiskers" extend from the box to the minimum and maximum values (or to the farthest data point within a certain range, with outliers indicated separately).

How to construct a box and whisker plot:

Order the data.
Find the five-number summary.
Draw a number line that spans the range of the data.
Draw a box from Q1 to Q3.
Draw a line inside the box at the median (Q2).
Draw whiskers from the box to the minimum and maximum values that are not outliers.
Identify outliers (typically, values less than Q1 - 1.5 × IQR or greater than Q3 + 1.5 × IQR) and plot them as individual points beyond the whiskers.
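To make the outlier rule concrete, the following Python sketch computes a five-number summary, the outlier fences, and the whisker ends. The dataset is made up for illustration (it is not from the lesson) and the helper names are hypothetical. Quartiles are found with the median-of-halves method, leaving the median out of both halves when the number of values is odd.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[half + n % 2:])   # skips the middle value when n is odd
    return data[0], q1, median(data), q3, data[-1]


def outlier_fences(q1, q3):
    """Return the lower and upper fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [2, 4, 6, 8, 10, 12, 14, 40]   # made-up data with one extreme value
minimum, q1, q2, q3, maximum = five_number_summary(data)
low, high = outlier_fences(q1, q3)

outliers = [x for x in data if x < low or x > high]
inside = [x for x in data if low <= x <= high]

print(minimum, q1, q2, q3, maximum)   # 2 5.0 9.0 13.0 40
print(low, high)                      # -7.0 25.0
print(outliers)                       # [40]
print(min(inside), max(inside))       # whisker ends: 2 14
```

In this sketch the upper whisker stops at 14, the largest value inside the fences, and 40 would be plotted as a separate point.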

Example: Data: 5, 7, 9, 11, 13, 15, 17, 19, 21.

Five-number summary: Minimum = 5, Q1 = 8, Median = 13, Q3 = 18, Maximum = 21. The median (13) is a data point, so it is left out of both halves: Q1 is the median of 5, 7, 9, 11, which is (7 + 9) / 2 = 8, and Q3 is the median of 15, 17, 19, 21, which is (17 + 19) / 2 = 18.

IQR = 18 - 8 = 10

Q1 - 1.5 × IQR = 8 - 1.5 × 10 = -7

Q3 + 1.5 × IQR = 18 + 1.5 × 10 = 33

All values lie between -7 and 33, so there are no outliers. Draw a box and whisker plot with these values.

2.3 Comparing Datasets

When comparing two or more datasets, use both measures of central tendency (mean, median, mode) and measures of dispersion (range, IQR, standard deviation). Central tendency tells you about the average or typical value in each dataset. Dispersion tells you how spread out the data is in each dataset.

For example, two datasets might have the same mean, but one could have a much larger standard deviation, indicating that its data is more spread out. The mean is then a less representative summary of that dataset.

2.4 The Effect of Outliers

An outlier is a data point that is significantly different from the other data points in a dataset. Outliers can have a significant effect on measures of central tendency and dispersion.

Mean: Highly sensitive to outliers. A single outlier can significantly change the mean.

Median: Less sensitive to outliers. The median depends on the position of the middle value in the ordered data, not on how large or small the extreme values are.

Range: Highly sensitive to outliers, as it depends only on the maximum and minimum values.
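The effect of a single outlier on these three measures can be seen with a short Python sketch. The marks below are made-up values for illustration only, and the function name summarise is hypothetical.

```python
from statistics import mean, median


def summarise(values):
    """Return (mean, median, range) for a list of numbers."""
    return mean(values), median(values), max(values) - min(values)


marks = [50, 52, 54, 56, 58]     # made-up marks, no outlier
with_outlier = marks + [100]     # the same marks plus one extreme value

print(summarise(marks))          # mean 54, median 54, range 8
print(summarise(with_outlier))   # mean about 61.7, median 55, range 50
```

One extra value of 100 pulls the mean up by almost 8 marks and increases the range from 8 to 50, while the median moves by only 1 mark.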