Linear regression: The line of best fit, chosen to minimize the sum of squared residuals. The correlation coefficient r measures the strength and direction of the linear relationship (-1 ≤ r ≤ 1).
- Plot the data points to visualize the relationship
- Calculate means of x and y values
- Compute the slope using: m = Σ[(xᵢ - x̄)(yᵢ - ȳ)] / Σ(xᵢ - x̄)²
- Find y-intercept: b = ȳ - mx̄
- Calculate correlation coefficient: r = Σ[(xᵢ - x̄)(yᵢ - ȳ)] / √[Σ(xᵢ - x̄)² · Σ(yᵢ - ȳ)²]
- Interpret the results
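The steps above can be sketched in Python using only the standard library. The function name linear_fit is a placeholder chosen for this illustration, and the sample call uses the five points from the worked example in this section, (1, 3) through (5, 11).

```python
from math import sqrt

def linear_fit(xs, ys):
    """Return (m, b, r) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    m = sxy / sxx                # slope
    b = y_bar - m * x_bar        # y-intercept
    r = sxy / sqrt(sxx * syy)    # correlation coefficient
    return m, b, r

m, b, r = linear_fit([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(m, b, r)  # prints 2.0 1.0 1.0
```

The function divides by Σ(xᵢ - x̄)², so it assumes the x values are not all identical.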
x̄ = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
ȳ = (3 + 5 + 7 + 9 + 11)/5 = 35/5 = 7
Σ[(xᵢ - x̄)(yᵢ - ȳ)] = (1-3)(3-7) + (2-3)(5-7) + (3-3)(7-7) + (4-3)(9-7) + (5-3)(11-7)
= (-2)(-4) + (-1)(-2) + (0)(0) + (1)(2) + (2)(4)
= 8 + 2 + 0 + 2 + 8 = 20
Σ(xᵢ - x̄)² = (1-3)² + (2-3)² + (3-3)² + (4-3)² + (5-3)²
= (-2)² + (-1)² + (0)² + (1)² + (2)²
= 4 + 1 + 0 + 1 + 4 = 10
m = 20/10 = 2
b = ȳ - mx̄ = 7 - 2(3) = 7 - 6 = 1
ŷ = 2x + 1
Σ(yᵢ - ȳ)² = (3-7)² + (5-7)² + (7-7)² + (9-7)² + (11-7)²
= (-4)² + (-2)² + (0)² + (2)² + (4)² = 16 + 4 + 0 + 4 + 16 = 40
r = 20 / √(10 × 40) = 20 / √400 = 20/20 = 1
Correlation: r = 1 (perfect positive correlation)
The linear regression equation is ŷ = 2x + 1, with a correlation coefficient of r = 1, indicating a perfect positive linear relationship.
• Perfect correlation: r = 1 indicates all points lie exactly on the regression line
• Slope interpretation: For each unit increase in x, y increases by 2 units
• Y-intercept: When x = 0, predicted y = 1
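As a small illustration of these interpretations, the fitted line ŷ = 2x + 1 can be evaluated directly. The helper name predict is an assumption made for this sketch.

```python
def predict(x, m=2.0, b=1.0):
    """Evaluate the fitted line y-hat = m*x + b (defaults: the line from this example)."""
    return m * x + b

# Y-intercept: the prediction at x = 0
print(predict(0))                # 1.0

# Slope: the change in the prediction when x increases by one unit
print(predict(4) - predict(3))   # 2.0
```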
Quadratic regression: The parabola of best fit that models data with a curved relationship. Uses the form ŷ = ax² + bx + c.
Using the form ŷ = ax² + bx + c, we substitute each data point:
For (0,2): c = 2
For (1,15): a + b + c = 15 → a + b + 2 = 15 → a + b = 13
For (2,24): 4a + 2b + c = 24 → 4a + 2b + 2 = 24 → 4a + 2b = 22
From a + b = 13: b = 13 - a
Substitute: 4a + 2(13 - a) = 22
4a + 26 - 2a = 22
2a = -4
a = -2
b = 13 - (-2) = 15
So the parabola through these three points is ŷ = -2x² + 15x + 2
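The three-point calculation can be checked with a short Python sketch. It reaches the same coefficients by a different route, Newton divided differences, rather than by substitution, and the function name parabola_through is a placeholder for this illustration. Exact fractions avoid rounding.

```python
from fractions import Fraction

def parabola_through(p0, p1, p2):
    """Return (a, b, c) of the parabola y = a*x**2 + b*x + c through three points.

    Uses divided differences; the three x values must be distinct.
    """
    (x0, y0), (x1, y1), (x2, y2) = [
        (Fraction(x), Fraction(y)) for x, y in (p0, p1, p2)
    ]
    d01 = (y1 - y0) / (x1 - x0)      # slope between the first two points
    d12 = (y2 - y1) / (x2 - x1)      # slope between the last two points
    a = (d12 - d01) / (x2 - x0)      # second divided difference
    b = d01 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c

a, b, c = parabola_through((0, 2), (1, 15), (2, 24))
print(a, b, c)  # prints -2 15 2
```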
Using technology for all 6 points: ŷ = -0.8x² + 8.6x + 2
For a quadratic ax² + bx + c with a < 0, maximum occurs at x = -b/(2a)
x = -8.6/(2(-0.8)) = -8.6/(-1.6) = 5.375 seconds
ŷ(5.375) = -0.8(5.375)² + 8.6(5.375) + 2
= -0.8(28.890625) + 46.225 + 2 = -23.1125 + 46.225 + 2 = 25.1125 ≈ 25.1 feet
Max height: 25.1 ft at 5.375 seconds
The quadratic regression model is ŷ = -0.8x² + 8.6x + 2, predicting a maximum height of approximately 25.1 feet at 5.375 seconds.
• Quadratic vertex: x = -b/(2a) for maximum when a < 0
• Parabolic motion: Gravity creates quadratic trajectory
• Regression coefficients: Determined by minimizing sum of squared errors
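The vertex calculation for the model ŷ = -0.8x² + 8.6x + 2 can be sketched as follows. The function name vertex is an assumption made for this illustration.

```python
def vertex(a, b, c):
    """Return (x, y) at the vertex of y = a*x**2 + b*x + c (requires a != 0)."""
    x = -b / (2 * a)
    y = a * x ** 2 + b * x + c
    return x, y

# Coefficients of the regression model from this example
t_max, h_max = vertex(-0.8, 8.6, 2)
print(round(t_max, 3), round(h_max, 1))  # prints 5.375 25.1
```

The vertex is a maximum only when a < 0; for a > 0 the same formula gives the minimum.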
Exponential regression: The exponential curve of best fit in the form ŷ = ab^x, where a is initial value and b is growth factor.
Looking at ratios: 120/100 = 1.2, 144/120 = 1.2, 173/144 ≈ 1.2, etc.
This suggests exponential growth with factor b ≈ 1.2
At x = 0, y = 100, so a = 100
Using regression analysis: ŷ = 100(1.2)^x
After 8 hours: ŷ(8) = 100(1.2)^8 = 100(4.2998) ≈ 430 bacteria
Check: ŷ(1) = 100(1.2) = 120 ✓, ŷ(2) = 100(1.44) = 144 ✓
Population after 8 hours: 430 bacteria
The exponential regression model is ŷ = 100(1.2)^x, predicting approximately 430 bacteria after 8 hours.
• Exponential growth: Constant percentage increase
• Growth factor: b > 1 for growth, 0 < b < 1 for decay
• Initial value: y-intercept when x = 0
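One common way to carry out exponential regression is to fit a straight line to the points (x, ln y) and transform back. The sketch below takes that approach, which is an assumption and not necessarily what a given calculator does. It uses only the four values quoted in the ratios above (100, 120, 144, 173 at x = 0 to 3), and the name exponential_fit is a placeholder.

```python
from math import exp, log

def exponential_fit(xs, ys):
    """Fit y = a * b**x by least squares on (x, ln y); requires every y > 0."""
    n = len(xs)
    logs = [log(y) for y in ys]
    x_bar = sum(xs) / n
    l_bar = sum(logs) / n
    slope = (sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, logs))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = l_bar - slope * x_bar
    # ln y = intercept + slope * x  means  y = e**intercept * (e**slope)**x
    return exp(intercept), exp(slope)

hours = [0, 1, 2, 3]
counts = [100, 120, 144, 173]
a, b = exponential_fit(hours, counts)
print(round(a, 1), round(b, 3))   # close to a = 100 and b = 1.2

# Prediction from the rounded model y-hat = 100(1.2)^x
print(round(100 * 1.2 ** 8))      # prints 430
```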
Regression analysis: Statistical method to find the best-fitting function for a set of data points
Correlation coefficient (r): Measures strength and direction of linear relationship (-1 ≤ r ≤ 1)
Coefficient of determination (r²): Proportion of variance in y explained by x
Residual: Difference between observed and predicted values
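The last two definitions can be made concrete with a short sketch. It computes r² as 1 - SS_res/SS_tot, a standard formulation, using the points from the linear example. The second model, ŷ = 2x, is a hypothetical poorer fit added only for contrast.

```python
def r_squared(ys, y_hats):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                # total variation
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

# Fitted line from the linear example: y-hat = 2x + 1
fitted = [2 * x + 1 for x in xs]
print([y - yh for y, yh in zip(ys, fitted)])  # residuals: [0, 0, 0, 0, 0]
print(r_squared(ys, fitted))                  # 1.0

# Hypothetical poorer model for contrast: y-hat = 2x
poorer = [2 * x for x in xs]
print([y - yh for y, yh in zip(ys, poorer)])  # residuals: [1, 1, 1, 1, 1]
print(r_squared(ys, poorer))                  # 0.875
```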
Model comparison: Evaluate multiple regression models using correlation coefficients and residual analysis.
12 - 5 = 7, 21 - 12 = 9, 32 - 21 = 11, 45 - 32 = 13
First differences: 7, 9, 11, 13 (not constant)
9 - 7 = 2, 11 - 9 = 2, 13 - 11 = 2
Second differences: 2, 2, 2 (constant)
12/5 = 2.4, 21/12 = 1.75, 32/21 ≈ 1.52, 45/32 ≈ 1.41
Ratios: Not constant
Since second differences are constant, quadratic model is best
Regression analysis confirms: quadratic has r² = 1.00 (perfect fit)
r² = 1.00 (perfect fit)
The quadratic model ŷ = x² + 4x best fits the data with r² = 1.00, indicating a perfect fit.
• Constant second differences: Indicate quadratic relationship
• Model selection: Compare r² across candidate models and confirm the choice with residual analysis
• Pattern recognition: Use differences and ratios to identify model type
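The difference and ratio checks can be sketched in Python. The function names and the tolerance parameter are assumptions made for this illustration, and the sketch presumes equally spaced x values and nonzero y values.

```python
def differences(values):
    """Differences between consecutive values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def is_constant(values, tol):
    return max(values) - min(values) <= tol

def suggest_model(ys, tol=1e-9):
    """Suggest a model type from first differences, second differences, and ratios."""
    first = differences(ys)
    second = differences(first)
    ratios = [later / earlier for earlier, later in zip(ys, ys[1:])]
    if is_constant(first, tol):
        return "linear"
    if is_constant(second, tol):
        return "quadratic"
    if is_constant(ratios, tol):
        return "exponential"
    return "unclear"

ys = [5, 12, 21, 32, 45]
print(differences(ys))               # [7, 9, 11, 13]
print(differences(differences(ys)))  # [2, 2, 2]
print(suggest_model(ys))             # quadratic
```

With measured data the differences or ratios are rarely exactly constant, so a larger tol is usually needed.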
Exponential growth: Characterized by constant percentage increase, common in early product adoption phases.
180/100 = 1.8, 324/180 = 1.8, 583/324 ≈ 1.8, 1050/583 ≈ 1.8
Constant ratio ≈ 1.8 indicates exponential growth
Initial value a = 100, growth factor b = 1.8
Model: ŷ = 100(1.8)^x
After 6 months: ŷ(6) = 100(1.8)^6 = 100(34.01) ≈ 3401 units
Sales after 6 months: 3401 units
The exponential model ŷ = 100(1.8)^x best fits the data, predicting sales of approximately 3401 units after 6 months.
• Constant ratios: Indicate exponential relationship
• Real-world context: Early product growth often follows exponential pattern
• Prediction accuracy: Consider market saturation for long-term projections
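A quick check of this model against the sales figures can be sketched as follows. The sketch assumes the figures 100, 180, 324, 583, 1050 correspond to months 0 through 4, consistent with the initial value a = 100, and the function name model is a placeholder.

```python
sales = [100, 180, 324, 583, 1050]   # assumed to be months 0 through 4

# Ratios of consecutive values; roughly constant ratios suggest exponential growth
ratios = [later / earlier for earlier, later in zip(sales, sales[1:])]
print([round(r, 2) for r in ratios])  # [1.8, 1.8, 1.8, 1.8]

def model(x, a=100, b=1.8):
    """Exponential model y-hat = a * b**x."""
    return a * b ** x

# Residuals (observed minus predicted) stay small for the known months
residuals = [y - model(x) for x, y in enumerate(sales)]
print([round(e, 2) for e in residuals])

# Prediction for month 6
print(round(model(6)))                # 3401
```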
Regression: Statistical technique to find the line or curve that best fits a set of data points
Least squares method: Minimizes the sum of squared vertical distances between the data points and the fitted line or curve
Residual: Observed value minus predicted value (e = y - ŷ)
Coefficient of determination: r² represents the proportion of variance in y explained by x
- Data preparation: Organize data points and check for outliers
- Scatter plot: Visualize the relationship between variables
- Model selection: Choose appropriate function type based on pattern
- Parameter calculation: Use formulas or technology to find regression coefficients
- Model evaluation: Assess fit using correlation coefficient and residual analysis
- Application: Use the model for prediction and interpretation
• Linear regression: ŷ = mx + b
• Quadratic regression: ŷ = ax² + bx + c
• Exponential regression: ŷ = ab^x
• Correlation coefficient: r = Σ[(xᵢ-x̄)(yᵢ-ȳ)] / √[Σ(xᵢ-x̄)² · Σ(yᵢ-ȳ)²]
• Residual: e = y - ŷ
• Coefficient of determination: r² = (correlation coefficient)² for linear regression
Analysis: The chart compares how well different regression models fit the same data.
- Linear: Simplest model, may not capture curvature
- Quadratic: Better fit for curved data
- Exponential: Good for rapidly increasing data
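A comparison of this kind can be sketched in Python. The data behind the chart are not given here, so the sketch borrows the points from the model comparison example (y = 5, 12, 21, 32, 45 at x = 1 to 5). Solving the normal equations directly, fitting the exponential model on ln y, and scoring every model with r² on the original scale are all assumptions of this illustration; calculators may report a different r² for the exponential fit.

```python
from math import exp, log

def solve(A, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [v] for row, v in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def poly_fit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first (normal equations)."""
    A = [[sum(x ** (i + j) for x in xs) for j in range(degree + 1)]
         for i in range(degree + 1)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(degree + 1)]
    return solve(A, rhs)

def r_squared(ys, y_hats):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [5, 12, 21, 32, 45]

scores = {}
for name, degree in (("linear", 1), ("quadratic", 2)):
    coeffs = poly_fit(xs, ys, degree)
    y_hats = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    scores[name] = r_squared(ys, y_hats)

# Exponential: fit a line to (x, ln y), then transform back to y = a * b**x
ln_a, ln_b = poly_fit(xs, [log(y) for y in ys], 1)
scores["exponential"] = r_squared(ys, [exp(ln_a) * exp(ln_b) ** x for x in xs])

for name, score in scores.items():
    print(f"{name}: r^2 = {score:.4f}")
```

For these points the quadratic model scores highest, matching the conclusion from the constant second differences.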