Class: MoreMath::LinearRegression
- Defined in: lib/more_math/linear_regression.rb
Overview
This class computes a linear regression for the given image and domain data sets.
Linear regression is a statistical method that models the relationship between a dependent variable (image) and one or more independent variables (domain). It fits a linear equation to observed data points, which can then be used to make predictions or to quantify the relationship. This class handles the simple case of a single independent variable.
The implementation uses the least squares method to find the best-fit line y = ax + b, where ‘a’ is the slope and ‘b’ is the y-intercept.
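As a rough illustration of what a least squares fit produces, the sketch below fits a line to four points using the mean-centered form of the solution (slope = covariance / variance). The helper name `fit_line` and the sample data are hypothetical and are not part of MoreMath; the class itself is only assumed to yield equivalent coefficients.

```ruby
# Hypothetical helper (not part of MoreMath): least squares fit of
# y = ax + b using the mean-centered form of the solution.
def fit_line(domain, image)
  mean_x = domain.sum.fdiv(domain.size)
  mean_y = image.sum.fdiv(image.size)

  # Sum of cross-deviations and sum of squared x-deviations
  s_xy = domain.zip(image).sum { |x, y| (x - mean_x) * (y - mean_y) }
  s_xx = domain.sum { |x| (x - mean_x)**2 }

  a = s_xy / s_xx           # slope
  b = mean_y - a * mean_x   # y-intercept
  [a, b]
end

# Sample points that lie exactly on y = 2x + 1
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
puts "y = #{a}x + #{b}"  # y = 2.0x + 1.0
```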
Instance Attribute Summary
- #a ⇒ Float (readonly): The slope of the line.
- #b ⇒ Float (readonly): The offset of the line.
- #domain ⇒ Array<Numeric> (readonly): The domain data as an array.
- #image ⇒ Array<Numeric> (readonly): The image data as an array.
Instance Method Summary
- #compute ⇒ self (private): Computes the linear regression parameters using the least squares method.
- #initialize(image, domain = (0...image.size).to_a) ⇒ LinearRegression (constructor): Creates a new LinearRegression instance with image and domain data.
- #r2 ⇒ Float: Returns the coefficient of determination (R²).
- #residuals ⇒ Array<Float>: Returns the residuals of this linear regression.
- #slope_zero?(alpha = 0.05) ⇒ Boolean: Returns true if the slope is not significantly different from zero at significance level alpha.
- #tvalue ⇒ Float (private): Calculates the t-value for testing slope significance.
Constructor Details
#initialize(image, domain = (0...image.size).to_a) ⇒ LinearRegression
Creates a new LinearRegression instance with image and domain data.
Initializes the linear regression model using the provided data points. The domain data represents independent variables (x-values) and the image data represents dependent variables (y-values).
    # File 'lib/more_math/linear_regression.rb', line 46
    def initialize(image, domain = (0...image.size).to_a)
      image.size != domain.size and raise ArgumentError,
        "image and domain have unequal sizes"
      @image, @domain = image, domain
      compute
    end
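The argument handling can be tried in isolation. The sketch below uses a hypothetical stand-in, `regression_inputs`, that copies only the default-domain expression and the size check from the constructor; it does not build a regression object.

```ruby
# Hypothetical stand-in for the constructor's argument handling.
# The default domain is the list of indexes 0, 1, ..., image.size - 1.
def regression_inputs(image, domain = (0...image.size).to_a)
  if image.size != domain.size
    raise ArgumentError, "image and domain have unequal sizes"
  end
  [image, domain]
end

_image, domain = regression_inputs([2.0, 4.1, 5.9])
p domain  # [0, 1, 2]

begin
  regression_inputs([1, 2, 3], [0, 1])  # three y-values, two x-values
rescue ArgumentError => e
  puts e.message  # image and domain have unequal sizes
end
```

Omitting the domain therefore treats the image as evenly spaced samples, which suits time series recorded at regular intervals.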
Instance Attribute Details
#a ⇒ Float (readonly)
The slope of the line.
Returns the calculated slope (a) of the best-fit line y = ax + b.
    # File 'lib/more_math/linear_regression.rb', line 72
    def a
      @a
    end
#b ⇒ Float (readonly)
The offset of the line.
Returns the calculated y-intercept (b) of the best-fit line y = ax + b.
    # File 'lib/more_math/linear_regression.rb', line 79
    def b
      @b
    end
#domain ⇒ Array<Numeric> (readonly)
The domain data as an array.
Returns the independent variable values used in the regression.
    # File 'lib/more_math/linear_regression.rb', line 65
    def domain
      @domain
    end
#image ⇒ Array<Numeric> (readonly)
The image data as an array.
Returns the dependent variable values used in the regression.
    # File 'lib/more_math/linear_regression.rb', line 58
    def image
      @image
    end
Instance Method Details
#compute ⇒ self (private)
Computes the linear regression parameters using the least squares method.
This internal method calculates the slope (a) and intercept (b) coefficients by solving the normal equations derived from minimizing the sum of squared residuals.
    # File 'lib/more_math/linear_regression.rb', line 151
    def compute
      size = @image.size
      sum_xx = sum_xy = sum_x = sum_y = 0.0
      @domain.zip(@image) do |x, y|
        sum_xx += x ** 2
        sum_xy += x * y
        sum_x += x
        sum_y += y
      end
      @a = (size * sum_xy - sum_x * sum_y) / (size * sum_xx - sum_x ** 2)
      @b = (sum_y - @a * sum_x) / size
      self
    end
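For reference, the normal equations behind these coefficients can be written out as follows. This is the standard textbook derivation for simple least squares, with n the number of data points (`size` in the listing) and every sum taken over i = 1..n.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} \left( y_i - a x_i - b \right)^2 \\
\frac{\partial S}{\partial a} = 0 \;&\Rightarrow\; \sum x_i y_i = a \sum x_i^2 + b \sum x_i \\
\frac{\partial S}{\partial b} = 0 \;&\Rightarrow\; \sum y_i = a \sum x_i + n b \\
a &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2} \\
b &= \frac{\sum y_i - a \sum x_i}{n}
\end{aligned}
```

The denominator of a is zero exactly when all x-values are identical, in which case no unique slope exists.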
#r2 ⇒ Float
Returns the coefficient of determination (R²).
R² measures the proportion of the variance in the dependent variable that is predictable from the independent variable. Higher values indicate a better fit. This implementation clamps the result at 0.0, so the returned value lies between 0 and 1.
    # File 'lib/more_math/linear_regression.rb', line 133
    def r2
      image_seq = MoreMath::Sequence.new(@image)
      sum_res = residuals.inject(0.0) { |s, r| s + r ** 2 }
      [
        1.0 - sum_res / image_seq.sum_of_squares,
        0.0,
      ].max
    end
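The same quantity can be computed by hand as 1 - SS_res / SS_tot. The sketch below assumes that `sum_of_squares` is the sum of squared deviations from the mean (the total sum of squares); the helper `r_squared` and the sample data are hypothetical, and the coefficients 0.5 and 1/6 are the least squares fit for these three points.

```ruby
# Hypothetical helper (not part of MoreMath): R² from observed and
# predicted values, clamped at 0.0 like the listing above.
def r_squared(image, predicted)
  mean = image.sum.fdiv(image.size)
  ss_res = image.zip(predicted).sum { |y, y_hat| (y - y_hat)**2 }  # residual sum of squares
  ss_tot = image.sum { |y| (y - mean)**2 }                         # total sum of squares
  [1.0 - ss_res / ss_tot, 0.0].max
end

domain = [0, 1, 2]
image  = [0, 1, 1]
a, b   = 0.5, 1.0 / 6  # least squares coefficients for this data
predicted = domain.map { |x| a * x + b }

puts r_squared(image, predicted).round(4)  # 0.75
```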
#residuals ⇒ Array<Float>
Returns the residuals of this linear regression.
Residuals are the differences between observed values and predicted values from the regression line. They represent the error in prediction for each data point.
    # File 'lib/more_math/linear_regression.rb', line 115
    def residuals
      result = []
      @domain.zip(@image) do |x, y|
        result << y - (@a * x + @b)
      end
      result
    end
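A small worked example may help here. In the sketch below the line y = 2x + 1 is simply given rather than fitted, so that each residual can be checked by hand; `residuals_for` is a hypothetical helper, not a MoreMath method.

```ruby
# Hypothetical helper (not part of MoreMath): observed minus predicted
# value for each point, given a slope a and intercept b.
def residuals_for(domain, image, a, b)
  domain.zip(image).map { |x, y| y - (a * x + b) }
end

# Predictions from y = 2x + 1 at x = 0, 1, 2 are 1.0, 3.0, 5.0
p residuals_for([0, 1, 2], [1.5, 2.5, 5.5], 2.0, 1.0)  # [0.5, -0.5, 0.5]
```

A positive residual means the observed point lies above the line, a negative one below it.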
#slope_zero?(alpha = 0.05) ⇒ Boolean
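Returns true if the slope is not significantly different from zero.
Performs a two-sided t-test on the slope coefficient at the given significance level (alpha, a value between 0 and 1). The method returns true when the null hypothesis of a zero slope cannot be rejected, and false when the slope is statistically significant, that is, when the data indicate a meaningful linear relationship between the independent and dependent variables. With fewer than three data points there are no degrees of freedom left for the test, and the method returns true.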
    # File 'lib/more_math/linear_regression.rb', line 96
    def slope_zero?(alpha = 0.05)
      (0..1) === alpha or raise ArgumentError, 'alpha should be in 0..1'
      df = @image.size - 2
      return true if df <= 0 # not enough values to check
      t = tvalue
      td = TDistribution.new df
      # Compare |t| with the two-sided critical value at level alpha
      t.abs <= td.inverse_probability(1 - alpha.abs / 2.0).abs
    end
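The decision rule itself is a single comparison. The sketch below replaces the `TDistribution` lookup with a hard-coded critical value taken from a standard t-table (3.182 for 3 degrees of freedom at a two-sided 5% level); the helper name and the sample t-values are hypothetical.

```ruby
# Hypothetical helper (not part of MoreMath): true when |t| does not
# exceed the critical value, i.e. a zero slope cannot be rejected.
def slope_plausibly_zero?(t, critical)
  t.abs <= critical.abs
end

# Tabulated two-sided 5% critical value for 3 degrees of freedom
CRITICAL_T_DF3 = 3.182

puts slope_plausibly_zero?(1.2, CRITICAL_T_DF3)   # true
puts slope_plausibly_zero?(-8.5, CRITICAL_T_DF3)  # false
```

Note the direction of the result: true means "no evidence of a slope", so callers looking for a significant trend should test for false.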
#tvalue ⇒ Float (private)
Calculates the t-value for testing slope significance.
This internal method computes the t-statistic used in hypothesis testing to determine if the slope differs significantly from zero.
    # File 'lib/more_math/linear_regression.rb', line 171
    def tvalue
      df = @image.size - 2
      return 0.0 if df <= 0
      # Sum of squared residuals around the fitted line
      sse_y = 0.0
      @domain.zip(@image) do |x, y|
        f_x = a * x + b
        sse_y += (y - f_x) ** 2
      end
      # Sum of squared deviations of the x-values from their own mean
      mean = @domain.inject(0.0) { |s, x| s + x } / @domain.size
      sse_x = @domain.inject(0.0) { |s, x| s + (x - mean) ** 2 }
      # t = slope / standard error of the slope
      t = a / (Math.sqrt(sse_y / df) / Math.sqrt(sse_x))
      t.nan? ? 0.0 : t
    end
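To make the statistic concrete, the sketch below computes t = a / SE(a) for three points, taking the standard error of the slope as sqrt(SSE / df) / sqrt(Sxx), where Sxx is the sum of squared deviations of the x-values from their mean. The helper `t_statistic` is hypothetical, and the coefficients 0.5 and 1/6 are the least squares fit for this data.

```ruby
# Hypothetical helper (not part of MoreMath): t-statistic of the slope
# for a line y = ax + b fitted to the given points.
def t_statistic(domain, image, a, b)
  df = image.size - 2
  return 0.0 if df <= 0  # too few points for a test

  sse = domain.zip(image).sum { |x, y| (y - (a * x + b))**2 }  # residual sum of squares
  mean_x = domain.sum.fdiv(domain.size)
  sxx = domain.sum { |x| (x - mean_x)**2 }                     # spread of the x-values

  t = a / (Math.sqrt(sse / df) / Math.sqrt(sxx))
  t.nan? ? 0.0 : t
end

# SSE = 1/6, df = 1, Sxx = 2, so t = 0.5 / sqrt(1/12) = sqrt(3)
puts t_statistic([0, 1, 2], [0, 1, 1], 0.5, 1.0 / 6).round(4)  # 1.7321
```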