Solved Exercises on Comparing Data Sets in Grade 7

Practice comparing distributions by analyzing centers, spreads, shapes, and outliers in these 10 worked exercises.

Solution: Exercises 1 to 3
1 Center Comparison
Exercise 1
Compare the centers of these two data sets: Set A: 10, 12, 14, 16, 18; Set B: 8, 10, 12, 14, 16. Which set has a higher center?
Definition:

Center of distribution: A measure of the typical value in a data set, usually the mean or median.

Method:
  1. Calculate the mean of each data set
  2. Compare the means to determine which center is higher
  3. Alternatively, put the data in order and compare the medians
Step 1: Calculate mean of Set A

Mean A = (10 + 12 + 14 + 16 + 18) ÷ 5 = 70 ÷ 5 = 14

Step 2: Calculate mean of Set B

Mean B = (8 + 10 + 12 + 14 + 16) ÷ 5 = 60 ÷ 5 = 12

Step 3: Compare the centers

Mean A = 14, Mean B = 12, so Set A has a higher center

Set A: (10+12+14+16+18) ÷ 5 = 14
Set B: (8+10+12+14+16) ÷ 5 = 12
Comparison: 14 vs 12, so Set A is higher

| Stat | Set A | Set B |
|---|---|---|
| Mean | 14 | 12 |
| Median | 14 | 12 |

Set A has higher center (mean = 14)
Final answer:

Set A has a higher center with a mean of 14 compared to Set B's mean of 12.

Applied rules:

Mean calculation: Sum of values ÷ Number of values

Center comparison: Higher mean indicates higher center

Central tendency: Mean represents typical value
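The calculation above can be double-checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are illustrative and not part of the exercise.

```python
from statistics import mean, median

set_a = [10, 12, 14, 16, 18]
set_b = [8, 10, 12, 14, 16]

# Center measures for each set
mean_a, mean_b = mean(set_a), mean(set_b)
median_a, median_b = median(set_a), median(set_b)

print("Set A: mean =", mean_a, "median =", median_a)
print("Set B: mean =", mean_b, "median =", median_b)

# Compare like with like: mean against mean
higher = "Set A" if mean_a > mean_b else "Set B"
print("Higher center:", higher)
```

Here the mean and the median agree within each set, so either measure leads to the same conclusion.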

2 Spread Comparison
Exercise 2
Compare the spread of these two data sets: Set C: 5, 10, 15, 20, 25; Set D: 10, 12, 14, 16, 18. Which set has a larger spread?
Definition:

Spread of distribution: How spread out the data values are from the center, measured by range or interquartile range.

Step 1: Calculate range of Set C

Range C = 25 - 5 = 20

Step 2: Calculate range of Set D

Range D = 18 - 10 = 8

Step 3: Compare the ranges

Range C = 20, Range D = 8, so Set C has a larger spread

Set C: Max - Min = 25 - 5 = 20
Set D: Max - Min = 18 - 10 = 8
Comparison: 20 vs 8, so Set C has larger spread

| Measure | Set C | Set D |
|---|---|---|
| Range | 20 | 8 |
| Mean | 15 | 14 |

Set C has larger spread (range = 20)
Final answer:

Set C has a larger spread with a range of 20 compared to Set D's range of 8.

Applied rules:

Range calculation: Maximum - Minimum

Spread comparison: Larger range indicates greater spread

Variability measure: Range shows overall spread
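As a quick check of the range rule, the sketch below wraps "maximum minus minimum" in a small helper. The function name `data_range` is made up for this illustration.

```python
def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

set_c = [5, 10, 15, 20, 25]
set_d = [10, 12, 14, 16, 18]

range_c = data_range(set_c)
range_d = data_range(set_d)

print("Range C =", range_c)
print("Range D =", range_d)
print("Larger spread:", "Set C" if range_c > range_d else "Set D")
```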

3 Shape Comparison
Exercise 3
Compare the shapes of these data sets: Set E: 2, 3, 4, 5, 6, 7, 8; Set F: 1, 2, 2, 3, 8, 8, 9. Describe the shape of each and how they differ.
Definition:

Shape of distribution: The pattern of how data is distributed, describing symmetry, skewness, and clustering.

Step 1: Analyze Set E

Set E: 2, 3, 4, 5, 6, 7, 8 - Values are evenly spaced, symmetrical around the center

Step 2: Analyze Set F

Set F: 1, 2, 2, 3, 8, 8, 9 - Values cluster at low and high ends, creating a bimodal shape

Step 3: Compare the shapes

Set E is symmetric and uniform, Set F is bimodal with clustering at extremes

Set E: 2, 3, 4, 5, 6, 7, 8 (symmetric, uniform)
Set F: 1, 2, 2, 3, 8, 8, 9 (bimodal, clustered)
Comparison: Set E is symmetric, Set F is bimodal

| Set | Data Points | Shape |
|---|---|---|
| E | 2, 3, 4, 5, 6, 7, 8 | Symmetric |
| F | 1, 2, 2, 3, 8, 8, 9 | Bimodal |

Set E: Symmetric, Set F: Bimodal
Final answer:

Set E has a symmetric, uniform distribution, while Set F has a bimodal distribution with clustering at the extremes.

Applied rules:

Shape analysis: Examine clustering and symmetry of values

Pattern recognition: Look for peaks, gaps, and overall distribution

Distribution types: Symmetric, skewed, bimodal, uniform
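Shape is easiest to judge from a picture. As a rough sketch, the hypothetical helper `dot_plot` below draws a text dot plot with one `*` per occurrence of each value. It assumes whole-number data, which holds for Sets E and F.

```python
from collections import Counter

def dot_plot(values):
    """Text dot plot: one row per whole number from min to max,
    with one '*' for each time that number appears."""
    counts = Counter(values)
    rows = []
    for v in range(min(values), max(values) + 1):
        rows.append(f"{v:>2} | " + "*" * counts[v])
    return "\n".join(rows)

set_e = [2, 3, 4, 5, 6, 7, 8]
set_f = [1, 2, 2, 3, 8, 8, 9]

print("Set E")
print(dot_plot(set_e))
print("Set F")
print(dot_plot(set_f))
```

In the Set F plot, the rows for 4 through 7 are empty, which makes the gap between the two clusters visible at a glance.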

Solution: Exercises 4 to 6
4 Outlier Detection
Exercise 4
Identify any outliers in these data sets: Set G: 12, 14, 15, 16, 17, 18, 50; Set H: 8, 9, 10, 11, 12, 13, 14.
Definition:

Outlier: A data point that is significantly different from other values in the data set.

Step 1: Examine Set G

Set G: 12, 14, 15, 16, 17, 18, 50 - The value 50 is much higher than the others

Step 2: Examine Set H

Set H: 8, 9, 10, 11, 12, 13, 14 - All values are close together

Step 3: Identify outliers

Set G has an outlier at 50, Set H has no outliers

Set G: 12, 14, 15, 16, 17, 18, 50 (outlier: 50)
Set H: 8, 9, 10, 11, 12, 13, 14 (no outliers)
Outlier impact: Set G's mean is pulled upward by the value 50

| Set | Data Points | Outliers | Effect on Mean |
|---|---|---|---|
| G | 12, 14, 15, 16, 17, 18, 50 | 50 | Pulled upward |
| H | 8, 9, 10, 11, 12, 13, 14 | None | Not distorted |

Set G: Outlier at 50, Set H: No outliers
Final answer:

Set G has an outlier at 50, which significantly affects the mean. Set H has no outliers.

Applied rules:

Outlier identification: Values that are much higher or lower than others

Impact on statistics: Outliers can significantly affect mean

Visual inspection: Look for gaps or extreme values
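To see how much the value 50 pulls on the mean of Set G, one can compute the mean and median with and without it. This is only an illustration of the outlier's effect, not a recommendation to delete data; the variable names are illustrative.

```python
from statistics import mean, median

set_g = [12, 14, 15, 16, 17, 18, 50]
without_50 = [x for x in set_g if x != 50]

mean_with = mean(set_g)            # 142 / 7
mean_without = mean(without_50)    # 92 / 6
median_with = median(set_g)
median_without = median(without_50)

print("Mean with 50:", round(mean_with, 1))
print("Mean without 50:", round(mean_without, 1))
print("Median with 50:", median_with)
print("Median without 50:", median_without)
```

The mean shifts by about 5 units while the median moves by only half a unit, which is why the median is called the more robust measure.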

5 Comparing Means and Medians
Exercise 5
Compare the mean and median of these data sets: Set I: 5, 10, 15, 20, 25; Set J: 1, 2, 3, 4, 50. How do outliers affect the measures of center?
Definition:

Robust measure: A statistic that is not heavily affected by outliers, like the median.

Step 1: Calculate mean and median for Set I

Mean I = (5+10+15+20+25) ÷ 5 = 15, Median I = 15

Step 2: Calculate mean and median for Set J

Mean J = (1+2+3+4+50) ÷ 5 = 12, Median J = 3

Step 3: Compare and analyze

Set I: Mean = Median (symmetric), Set J: Mean ≠ Median (affected by outlier)

Set I: Mean = (5+10+15+20+25) ÷ 5 = 15
Set I: Median (middle value) = 15
Set J: Mean = (1+2+3+4+50) ÷ 5 = 12
Set J: Median (middle value) = 3

| Set | Data | Mean | Median | Outlier Present |
|---|---|---|---|---|
| I | 5, 10, 15, 20, 25 | 15 | 15 | No |
| J | 1, 2, 3, 4, 50 | 12 | 3 | Yes (50) |

Set I: Mean = Median = 15, Set J: Mean (12) ≠ Median (3)
Final answer:

Set I has equal mean and median due to symmetry. Set J's mean (12) is much higher than its median (3) due to the outlier.

Applied rules:

Mean sensitivity: Affected by extreme values

Median robustness: Less affected by outliers

Skewed distributions: Mean pulled toward outlier

6 Interquartile Range Comparison
Exercise 6
Compare the interquartile ranges of these data sets: Set K: 10, 20, 30, 40, 50, 60, 70; Set L: 15, 25, 35, 45, 55, 65, 75.
Definition:

Interquartile range (IQR): The range of the middle 50% of the data, calculated as Q3 - Q1.

Step 1: Find Q1 and Q3 for Set K

Q1 = 20, Q3 = 60, IQR = 60 - 20 = 40

Step 2: Find Q1 and Q3 for Set L

Q1 = 25, Q3 = 65, IQR = 65 - 25 = 40

Step 3: Compare the IQRs

Both sets have the same IQR of 40, indicating similar spread in the middle half

Set K: Q1 = 20, Q3 = 60, IQR = 40
Set L: Q1 = 25, Q3 = 65, IQR = 40
Comparison: Both IQRs are equal at 40

| Set | Data | Q1 | Q3 | IQR |
|---|---|---|---|---|
| K | 10, 20, 30, 40, 50, 60, 70 | 20 | 60 | 40 |
| L | 15, 25, 35, 45, 55, 65, 75 | 25 | 65 | 40 |

Both sets have equal IQR = 40
Final answer:

Both data sets have the same interquartile range of 40, indicating that the middle 50% of both distributions have the same spread.

Applied rules:

Quartile calculation: Q1 = median of lower half, Q3 = median of upper half

IQR formula: IQR = Q3 - Q1

Spread measure: IQR represents middle 50% of data
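The quartile rule used here (Q1 is the median of the lower half, Q3 is the median of the upper half) can be sketched in Python as follows. The helper name `quartiles` is hypothetical, and the sketch assumes at least two data values. Other quartile conventions exist and can give slightly different results for some data sets.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.
    With an odd count, the overall median is left out of both halves.
    Assumes at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return median(lower), median(upper)

set_k = [10, 20, 30, 40, 50, 60, 70]
set_l = [15, 25, 35, 45, 55, 65, 75]

q1_k, q3_k = quartiles(set_k)
q1_l, q3_l = quartiles(set_l)

print("Set K: Q1 =", q1_k, "Q3 =", q3_k, "IQR =", q3_k - q1_k)
print("Set L: Q1 =", q1_l, "Q3 =", q3_l, "IQR =", q3_l - q1_l)
```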

Distribution Comparison Visual Guide
Key Statistical Measures: Mean, Median, Mode, Range, IQR
Mean: Average value
Median: Middle value
Mode: Most frequent value
Range: Max - Min
IQR: Q3 - Q1
Distribution Comparison Process:
Step 1: Calculate center measures (mean, median)
Step 2: Calculate spread measures (range, IQR)
Step 3: Analyze shape (symmetry, skewness)
Step 4: Identify outliers
Step 5: Compare the distributions systematically
Tip 1: Always compare the same statistical measures between distributions.
Tip 2: Use both center and spread measures for complete comparison.
Tip 3: Consider how outliers affect the mean differently than the median.
Common errors: Comparing different measures, ignoring outliers, focusing only on center.
Success strategies: Systematic approach, multiple measures, visual inspection.
Essential concepts:

• Center: Mean, median, mode indicate typical values

• Spread: Range, IQR indicate variability

• Shape: Symmetry, skewness, modality describe pattern

• Outliers: Extreme values that stand apart

Solution: Exercises 7 to 10
7 Real-World Data Comparison
Exercise 7
Compare the test scores of two classes: Class A: 78, 82, 85, 88, 90, 92, 95; Class B: 65, 70, 75, 80, 85, 90, 95. Which class performed better overall?
Definition:

Performance comparison: Evaluating which group has better overall results using statistical measures.

Step 1: Calculate mean for Class A

Mean A = (78+82+85+88+90+92+95) ÷ 7 = 610 ÷ 7 ≈ 87.1

Step 2: Calculate mean for Class B

Mean B = (65+70+75+80+85+90+95) ÷ 7 = 560 ÷ 7 = 80

Step 3: Compare centers and spreads

Class A: Mean ≈ 87.1, Range = 17; Class B: Mean = 80, Range = 30

Class A mean: (78+82+...+95) ÷ 7 ≈ 87.1
Class B mean: (65+70+...+95) ÷ 7 = 80
Comparison: 87.1 vs 80, so Class A performed better

| Class | Mean | Median | Range | Overall Performance |
|---|---|---|---|---|
| A | 87.1 | 88 | 17 | Better |
| B | 80 | 80 | 30 | Lower |

Class A performed better (mean = 87.1 vs 80)
Final answer:

Class A performed better overall with a mean score of approximately 87.1 compared to Class B's mean of 80.

Applied rules:

Mean comparison: Higher mean indicates better performance

Spread consideration: Class A also has less variability

Performance evaluation: Compare multiple measures
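Since this exercise compares several measures at once, it can help to compute them together. The sketch below uses a hypothetical helper, `summarize`, that returns the mean (rounded to one decimal place), the median, and the range of a list of scores.

```python
from statistics import mean, median

def summarize(scores):
    """Mean (1 decimal place), median, and range of a list of scores."""
    return {
        "mean": round(mean(scores), 1),
        "median": median(scores),
        "range": max(scores) - min(scores),
    }

class_a = [78, 82, 85, 88, 90, 92, 95]
class_b = [65, 70, 75, 80, 85, 90, 95]

summary_a = summarize(class_a)
summary_b = summarize(class_b)

print("Class A:", summary_a)
print("Class B:", summary_b)
```

Class A comes out ahead on both center measures and also has the smaller range, so the conclusion does not depend on which measure is chosen.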

8 Temperature Data Analysis
Exercise 8
Compare temperatures in two cities: City X: 65°, 68°, 70°, 72°, 75°; City Y: 55°, 60°, 70°, 80°, 85°. Which city has more consistent temperatures?
Definition:

Consistency: How close the data values are to each other, measured by the spread of the distribution.

Step 1: Calculate range for City X

Range X = 75° - 65° = 10°

Step 2: Calculate range for City Y

Range Y = 85° - 55° = 30°

Step 3: Compare consistency

City X has a smaller range (10°) than City Y (30°), so City X has more consistent temperatures

City X: 65° to 75°, Range = 10°
City Y: 55° to 85°, Range = 30°
Consistency comparison: City X is more consistent

| City | Temperatures | Mean | Range | Consistency |
|---|---|---|---|---|
| X | 65°, 68°, 70°, 72°, 75° | 70° | 10° | More consistent |
| Y | 55°, 60°, 70°, 80°, 85° | 70° | 30° | Less consistent |

City X: More consistent (range = 10°)
Final answer:

City X has more consistent temperatures with a range of 10° compared to City Y's range of 30°.

Applied rules:

Consistency measure: Smaller range indicates more consistency

Spread comparison: Compare ranges directly

Equal means: Despite same mean, spreads differ significantly
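The point that two data sets can share a mean yet differ in consistency can be checked directly. In this sketch the dictionary layout and variable names are illustrative choices; "most consistent" is taken to mean "smallest range", as in the exercise.

```python
temps = {
    "City X": [65, 68, 70, 72, 75],
    "City Y": [55, 60, 70, 80, 85],
}

# Mean and range for each city
means = {city: sum(t) / len(t) for city, t in temps.items()}
ranges = {city: max(t) - min(t) for city, t in temps.items()}

# The city with the smallest range is the most consistent
most_consistent = min(ranges, key=ranges.get)

print("Means:", means)
print("Ranges:", ranges)
print("Most consistent:", most_consistent)
```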

9 Speed Comparison Analysis
Exercise 9
Two runners' speeds in mph: Runner A: 5, 6, 7, 8, 9; Runner B: 4, 5, 6, 7, 10. Compare their average speeds and consistency.
Definition:

Performance metrics: Average speed and consistency are key measures of athletic performance.

Step 1: Calculate mean speed for each runner

Runner A: (5+6+7+8+9) ÷ 5 = 35 ÷ 5 = 7 mph

Runner B: (4+5+6+7+10) ÷ 5 = 32 ÷ 5 = 6.4 mph

Step 2: Calculate range for each runner

Runner A: 9 - 5 = 4 mph, Runner B: 10 - 4 = 6 mph

Step 3: Compare and conclude

Runner A is faster (7 mph vs 6.4 mph) and more consistent (range 4 vs 6)

Runner A mean: (5+6+7+8+9) ÷ 5 = 7 mph
Runner B mean: (4+5+6+7+10) ÷ 5 = 6.4 mph
Consistency: A range = 4 mph, B range = 6 mph, so A is more consistent

| Runner | Speeds | Mean Speed | Range | Performance |
|---|---|---|---|---|
| A | 5, 6, 7, 8, 9 | 7 mph | 4 mph | Faster, more consistent |
| B | 4, 5, 6, 7, 10 | 6.4 mph | 6 mph | Slower, less consistent |

Runner A: Faster (7 mph) and more consistent (range = 4 mph)
Final answer:

Runner A has a higher average speed (7 mph vs 6.4 mph) and is more consistent (range of 4 mph vs 6 mph).

Applied rules:

Mean calculation: Sum of values ÷ Count

Range calculation: Maximum - Minimum

Performance evaluation: Compare both center and spread

10 Comprehensive Distribution Analysis
Exercise 10
Analyze and compare these data sets: Set M: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10; Set N: 1, 1, 2, 2, 8, 8, 9, 9, 10, 10. Discuss center, spread, shape, and outliers.
Definition:

Comprehensive analysis: Examining all aspects of a distribution: center, spread, shape, and unusual features.

Step 1: Calculate center measures

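Set M: Mean = 55 ÷ 10 = 5.5, Median = (5 + 6) ÷ 2 = 5.5; Set N: Mean = 60 ÷ 10 = 6, Median = (8 + 8) ÷ 2 = 8

Step 2: Calculate spread measures

Set M: Range = 10 - 1 = 9, Q1 = 3, Q3 = 8, IQR = 5; Set N: Range = 10 - 1 = 9, Q1 = 2, Q3 = 9, IQR = 7

Step 3: Analyze shapes

Set M: Uniform distribution; Set N: Bimodal with clustering at extremes

Step 4: Compare distributions

Same range, but different centers, IQRs, and shapes. No value in either set stands far apart from the rest, so neither set has outliers.

Centers: M mean = 5.5, N mean = 6; M median = 5.5, N median = 8 (different centers)
Spreads: M range = 9, N range = 9 (equal ranges); M IQR = 5, N IQR = 7 (different IQRs)
Shapes: M = uniform, N = bimodal (different shapes)

| Aspect | Set M | Set N |
|---|---|---|
| Mean | 5.5 | 6 |
| Median | 5.5 | 8 |
| Range | 9 | 9 |
| IQR | 5 | 7 |
| Shape | Uniform | Bimodal |
| Outliers | None | None |

Same range, different center, IQR, and shape
Final answer:

Both sets have the same range (9), but their centers differ: Set M has a mean and median of 5.5, while Set N has a mean of 6 and a median of 8. Set N also has a larger IQR (7 vs 5) and a bimodal shape compared to Set M's uniform shape. Neither set has outliers.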

Applied rules:

Multiple measures: Compare center, spread, and shape

IQR sensitivity: Two sets with equal ranges can still differ in IQR, which measures the spread of the middle 50%

Shape importance: Different shapes indicate different patterns
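With ten values per set, the arithmetic in this exercise is easy to slip on, so it is worth recomputing every measure. The sketch below does that with the standard `statistics` module, using the median-of-halves rule for quartiles; the helper name `five_measures` is made up for this illustration.

```python
from statistics import mean, median

def five_measures(values):
    """Mean, median, range, Q1, Q3, and IQR, with quartiles taken
    as the medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return {
        "mean": mean(data),
        "median": median(data),
        "range": data[-1] - data[0],
        "Q1": q1,
        "Q3": q3,
        "IQR": q3 - q1,
    }

set_m = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
set_n = [1, 1, 2, 2, 8, 8, 9, 9, 10, 10]

result_m = five_measures(set_m)
result_n = five_measures(set_n)

print("Set M:", result_m)
print("Set N:", result_n)
```

Because both sets have an even number of values, each median is the average of the two middle values.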

Comprehensive Summary: Comparing Distributions
Core Concepts & Definitions:

Distribution: The pattern of variation in a set of data, showing how data values are arranged.

Center: A typical or representative value in a data set, measured by mean, median, or mode.

Spread: How spread out the data values are, measured by range, interquartile range (IQR), or standard deviation.

Shape: The overall pattern of the data distribution, including symmetry, skewness, and modality.

Mean: The average of all values in a data set, calculated by adding all values and dividing by the count.

Median: The middle value when data is arranged in order; less affected by outliers than the mean.

Mode: The value that appears most frequently in a data set.

Range: The difference between the maximum and minimum values in a data set.

Interquartile Range (IQR): The range of the middle 50% of the data, calculated as Q3 - Q1.

Outlier: A data point that is significantly different from other values in the data set.

Core Rules & Principles:

Essential Principles:

  • Compare the same statistical measures between distributions
  • Consider both center and spread when comparing distributions
  • Identify and discuss outliers when present
  • Analyze the shape of distributions to understand patterns

Key Formulas:

  • Mean = (Sum of all values) ÷ (Number of values)
  • Range = Maximum value - Minimum value
  • IQR = Third quartile (Q3) - First quartile (Q1)
  • Median: Middle value in an ordered data set (the mean of the two middle values when the count is even)
Step-by-Step Comparison Process:
  1. Calculate center measures: Find mean and median for each distribution
  2. Calculate spread measures: Find range and IQR for each distribution
  3. Analyze shape: Look for symmetry, skewness, and modality
  4. Identify outliers: Look for values significantly different from others
  5. Compare systematically: Contrast each measure between distributions
  6. Draw conclusions: Summarize similarities and differences
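The comparison process above can be sketched as a small program. This is one possible arrangement, not a standard routine: the helper names `measures` and `compare` are hypothetical, quartiles use the median-of-halves rule, and shape and outlier checks are left to visual inspection. The data are the runners' speeds from Exercise 9.

```python
from statistics import mean, median

def measures(values):
    """Steps 1 and 2: center (mean, median) and spread (range, IQR)."""
    data = sorted(values)
    half = len(data) // 2
    q1, q3 = median(data[:half]), median(data[-half:])
    return {
        "mean": mean(data),
        "median": median(data),
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
    }

def compare(name_1, values_1, name_2, values_2):
    """Step 5: for each measure, report which data set has the larger value."""
    m1, m2 = measures(values_1), measures(values_2)
    result = {}
    for key in m1:
        if m1[key] > m2[key]:
            result[key] = name_1
        elif m1[key] < m2[key]:
            result[key] = name_2
        else:
            result[key] = "equal"
    return result

larger = compare("Runner A", [5, 6, 7, 8, 9], "Runner B", [4, 5, 6, 7, 10])
for measure, name in larger.items():
    print(f"Larger {measure}: {name}")
```

Each measure is compared only with the same measure from the other set, which keeps the comparison like with like.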
Examples & Applications:

Simple Comparison Example:

  • Set A: 10, 12, 14, 16, 18 (Mean = 14, Range = 8)
  • Set B: 8, 10, 12, 14, 16 (Mean = 12, Range = 8)
  • Conclusion: Set A has higher center but same spread

Outlier Impact Example:

  • Set C: 5, 10, 15, 20, 25 (Mean = 15, Median = 15)
  • Set D: 5, 10, 15, 20, 50 (Mean = 20, Median = 15)
  • Conclusion: Outlier increases mean but not median

Shape Difference Example:

  • Set E: 1, 2, 3, 4, 5 (Uniform/symmetric)
  • Set F: 1, 1, 1, 5, 5 (Bimodal)
  • Conclusion: Different shapes despite same range
Tips, Tricks & Common Mistakes:

Tips & Tricks:

  • Always calculate multiple measures (center and spread) for complete comparison
  • Look for outliers before calculating mean, as they can skew results
  • Use median instead of mean when outliers are present
  • Consider the context when interpreting results
  • Visualize data when possible to see patterns more clearly

Common Mistakes:

  • Only comparing centers and ignoring spread
  • Not identifying outliers that affect the mean
  • Comparing different statistical measures between distributions
  • Forgetting to consider the shape of distributions
Key Notes for Memorization:
  • Mean is sensitive to outliers; median is much less so
  • Range is affected by outliers; IQR usually is not
  • Always compare like with like (mean vs mean, not mean vs median)
  • Center tells you about typical values
  • Spread tells you about variability
  • Shape tells you about the pattern of distribution
  • IQR focuses on the middle 50% of data
Additional Distribution Comparison Practice
Key Statistical Measures: Mean, Median, Range, IQR
Key definitions:

Distribution comparison: Systematically analyzing and contrasting two or more data sets.

Statistical measures: Quantitative values that describe characteristics of data sets.

Data analysis: The process of examining, cleaning, and interpreting data.

Comparison methodology:
  1. Collect: Organize data from each distribution
  2. Calculate: Compute relevant statistical measures
  3. Compare: Contrast measures between distributions
  4. Interpret: Draw meaningful conclusions
  5. Communicate: Clearly explain findings
Tip 1: Always include units in your final answers.
Tip 2: Round decimal answers to appropriate precision.
Tip 3: Consider how outliers affect different measures of center.
Tip 4: Use both numerical and visual methods to analyze data.
Common errors: Mismatched comparisons, calculation mistakes, ignoring outliers.
Success strategies: Systematic approach, verification, contextual interpretation.
Essential concepts:

• Mean: Arithmetic average

• Median: Middle value

• Mode: Most frequent value

• Range: Max - Min

• IQR: Q3 - Q1

Questions & Answers

Question: When should I use mean versus median to compare distributions?

Answer: Use these guidelines:

  • Use mean: When data is symmetric and without significant outliers
  • Use median: When data is skewed or has outliers that would distort the mean
  • Compare both: When mean and median differ significantly, it indicates skewness

If the mean is much higher than the median, the data is skewed right. If the mean is much lower than the median, the data is skewed left.
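The mean-versus-median guideline can be turned into a rough check. The function `skew_hint` below is a hypothetical helper that looks only at which of the two is larger. It does not judge whether the gap is "much" larger, so treat its output as a hint to be confirmed by looking at the data.

```python
from statistics import mean, median

def skew_hint(values):
    """Suggest a skew direction from the order of mean and median.
    Returns 'right', 'left', or 'none' (when mean equals median)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "none"

set_i = [5, 10, 15, 20, 25]   # from Exercise 5
set_j = [1, 2, 3, 4, 50]      # from Exercise 5

hint_i = skew_hint(set_i)
hint_j = skew_hint(set_j)

print("Set I skew hint:", hint_i)
print("Set J skew hint:", hint_j)
```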

Question: How do I know if a value is an outlier?

Answer: A common method is the 1.5×IQR rule:

  • Calculate Q1 (first quartile) and Q3 (third quartile)
  • Find IQR = Q3 - Q1
  • Calculate lower bound = Q1 - 1.5×IQR
  • Calculate upper bound = Q3 + 1.5×IQR
  • Any value below the lower bound or above the upper bound is considered an outlier

This provides a mathematical criterion for identifying outliers.
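The 1.5×IQR rule can be sketched as follows, applied to Sets G and H from Exercise 4. The helper name `find_outliers` is made up for this illustration, and quartiles are taken as the medians of the lower and upper halves; a different quartile convention could shift the bounds slightly.

```python
from statistics import median

def find_outliers(values):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 x IQR rule.
    Quartiles are the medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_bound or x > upper_bound]
    return lower_bound, upper_bound, outliers

set_g = [12, 14, 15, 16, 17, 18, 50]
set_h = [8, 9, 10, 11, 12, 13, 14]

low_g, high_g, outliers_g = find_outliers(set_g)
low_h, high_h, outliers_h = find_outliers(set_h)

print("Set G bounds:", low_g, "to", high_g, "outliers:", outliers_g)
print("Set H bounds:", low_h, "to", high_h, "outliers:", outliers_h)
```

For Set G, the value 50 lies well above the upper bound of 24, which matches the conclusion reached by visual inspection in Exercise 4.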

Question: What's the difference between range and interquartile range? When should I use each?

Answer: The key differences are:

  • Range: Maximum - Minimum (sensitive to outliers)
  • IQR: Q3 - Q1 (focuses on middle 50%, largely unaffected by outliers)
  • Use range: When you want the overall spread including extremes
  • Use IQR: When you want to focus on the core of the data

IQR is more robust and is often preferred for comparing the spread of typical values.