Class: Bullshit::Analysis

arithmetic_mean()

Returns the arithmetic mean of the measurements.

# File lib/bullshit.rb, line 1080
    def arithmetic_mean
      @arithmetic_mean ||= sum / size
    end

Also aliased as: mean

autocorrelation()

Returns the array of autocorrelation values c_k / c_0 (of length size - 1).

# File lib/bullshit.rb, line 1252
    def autocorrelation
      c = autovariance
      Array.new(c.size) { |k| c[k] / c[0] }
    end

autovariance()

Returns the array of autovariances (of length size - 1).

# File lib/bullshit.rb, line 1240
    def autovariance
      Array.new(size - 1) do |k|
        s = 0.0
        0.upto(size - k - 1) do |i|
          s += (@measurements[i] - arithmetic_mean) * (@measurements[i + k] - arithmetic_mean)
        end
        s / size
      end
    end
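
The following standalone sketch mirrors the two listings above on a plain array, so the values can be checked by hand. The helper names autocovariances and autocorrelations are hypothetical and not part of the library.

```ruby
# Standalone sketch; these helpers are illustrative and do not call the library.
def autocovariances(xs)
  n = xs.size
  mean = xs.sum.to_f / n
  # One value per lag k = 0 .. n - 2, each normalised by n.
  Array.new(n - 1) do |k|
    s = 0.0
    (0...(n - k)).each { |i| s += (xs[i] - mean) * (xs[i + k] - mean) }
    s / n
  end
end

def autocorrelations(xs)
  c = autocovariances(xs)
  c.map { |c_k| c_k / c[0] }
end

p autocorrelations([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this series the lag-0 value is 1 by construction, and the later lags fall from about 0.4 to about -0.4.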

common_standard_deviation(other)

Returns an estimate of the common standard deviation of the measurements of this and other.

# File lib/bullshit.rb, line 1182
    def common_standard_deviation(other)
      Math.sqrt(common_variance(other))
    end

common_variance(other)

Returns an estimate of the common (pooled) variance of the measurements of this and other.

# File lib/bullshit.rb, line 1188
    def common_variance(other)
      ((size - 1) * sample_variance + (other.size - 1) * other.sample_variance) /
        (size + other.size - 2)
    end
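
As a worked example, the pooled variance can be sketched on two plain arrays. The helper names below are hypothetical and independent of the library. The whole weighted sum of the two sample variances is divided by the combined degrees of freedom.

```ruby
# Standalone sketch; helper names are illustrative only.
def sample_variance(xs)
  return 0.0 if xs.size < 2
  mean = xs.sum.to_f / xs.size
  xs.sum { |x| (x - mean)**2 } / (xs.size - 1.0)
end

def pooled_variance(a, b)
  # Weighted sum of both sample variances over (n_a + n_b - 2).
  ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
    (a.size + b.size - 2)
end

a = [1.0, 2.0, 3.0]       # sample variance 1.0
b = [2.0, 4.0, 6.0, 8.0]  # sample variance 20/3
p pooled_variance(a, b)   # (2 * 1.0 + 3 * 20/3) / 5, about 4.4
```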

compute_student_df(other)

Computes the number of degrees of freedom for Student’s t-test.

# File lib/bullshit.rb, line 1194
    def compute_student_df(other)
      size + other.size - 2
    end

compute_welch_df(other)

Use an approximation of the Welch-Satterthwaite equation to compute the degrees of freedom for Welch’s t-test.

# File lib/bullshit.rb, line 1163
    def compute_welch_df(other)
      (sample_variance / size + other.sample_variance / other.size) ** 2 / (
        (sample_variance ** 2 / (size ** 2 * (size - 1))) +
        (other.sample_variance ** 2 / (other.size ** 2 * (other.size - 1))))
    end
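
To see how this approximation behaves, here is a standalone sketch that takes the two sample variances and sizes directly. The function name welch_df is hypothetical. With equal variances and equal sizes the result matches the Student value n_a + n_b - 2, while strongly unequal variances push it towards the smaller group's n - 1.

```ruby
# Standalone sketch of the Welch-Satterthwaite approximation;
# welch_df is an illustrative name, not a library method.
# The variances are expected to be Floats, the sizes Integers.
def welch_df(var_a, n_a, var_b, n_b)
  (var_a / n_a + var_b / n_b)**2 /
    (var_a**2 / (n_a**2 * (n_a - 1)) + var_b**2 / (n_b**2 * (n_b - 1)))
end

p welch_df(4.0, 10, 4.0, 10)    # equal variances and sizes: about 18
p welch_df(1.0, 10, 100.0, 10)  # very unequal variances: a little above 9
```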

confidence_interval(alpha = 0.05)

Returns the confidence interval for the arithmetic mean of this Analysis instance’s measurements at alpha level alpha, as a Range object.

# File lib/bullshit.rb, line 1232
    def confidence_interval(alpha = 0.05)
      td = TDistribution.new(size - 1)
      t = td.inverse_probability(alpha / 2).abs
      delta = t * sample_standard_deviation / Math.sqrt(size)
      (arithmetic_mean - delta)..(arithmetic_mean + delta)
    end

cover?(other, alpha = 0.05)

Returns true if this Analysis instance covers other, that is, if Welch’s t-test finds no significant difference between their arithmetic means at the alpha error level.

# File lib/bullshit.rb, line 1224
    def cover?(other, alpha = 0.05)
      t = t_welch(other)
      td = TDistribution.new(compute_welch_df(other))
      t.abs < td.inverse_probability(1 - alpha.abs / 2.0)
    end

detect_autocorrelation(lags = 20, alpha_level = 0.05)

This method tries to detect autocorrelation with the Ljung-Box statistic. If enough lags can be considered, it returns a hash with the results; otherwise it returns nil. The keys are:

:lags:	the number of lags,
:alpha_level:	the alpha level for the test,
:q:	the value of the ljung_box_statistic,
:p:	the probability computed from the chi-square distribution for q; a value of at least 1 - alpha_level counts as detected autocorrelation,
:detected:	true if autocorrelation was detected.

# File lib/bullshit.rb, line 1285
    def detect_autocorrelation(lags = 20, alpha_level = 0.05)
      if q = ljung_box_statistic(lags)
        p = ChiSquareDistribution.new(lags).probability(q)
        return {
          :lags         => lags,
          :alpha_level  => alpha_level,
          :q            => q,
          :p            => p,
          :detected     => p >= 1 - alpha_level,
        }
      end
    end

detect_outliers(factor = 3.0, epsilon = 1E-5)

Returns a result hash with the number of :very_low, :low, :high, and :very_high outliers, as determined by the box-plot rule, together with the :median, :iqr, and :factor values used. If no outliers were found or the iqr is less than epsilon, nil is returned.

# File lib/bullshit.rb, line 1302
    def detect_outliers(factor = 3.0, epsilon = 1E-5)
      half_factor = factor / 2.0
      quartile1 = percentile(25)
      quartile3 = percentile(75)
      iqr = quartile3 - quartile1
      iqr < epsilon and return
      result = @measurements.inject(Hash.new(0)) do |h, t|
        extreme =
          case t
          when -Infinity..(quartile1 - factor * iqr)
            :very_low
          when (quartile1 - factor * iqr)..(quartile1 - half_factor * iqr)
            :low
          when (quartile3 + half_factor * iqr)..(quartile3 + factor * iqr)
            :high
          when (quartile3 + factor * iqr)..Infinity
            :very_high
          end and h[extreme] += 1
        h
      end
      unless result.empty?
        result[:median] = median
        result[:iqr] = iqr
        result[:factor] = factor
        result
      end
    end
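
The classification can be illustrated with a standalone sketch of the conventional box-plot rule, in which the low fences are measured from the first quartile and the high fences from the third quartile. The function classify and its parameters are hypothetical; the quartiles are passed in directly instead of being computed.

```ruby
# Standalone sketch of the conventional box-plot fences;
# classify is an illustrative name, not a library method.
def classify(t, q1, q3, factor = 3.0)
  iqr = q3 - q1
  half = factor / 2.0
  if    t <= q1 - factor * iqr then :very_low
  elsif t <= q1 - half * iqr   then :low
  elsif t >= q3 + factor * iqr then :very_high
  elsif t >= q3 + half * iqr   then :high
  end # returns nil for values inside the inner fences
end

# With q1 = 10 and q3 = 20 the fences are -20, -5, 35 and 50.
[-25.0, -10.0, 15.0, 40.0, 60.0].each do |t|
  puts "#{t}: #{classify(t, 10.0, 20.0).inspect}"
end
```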

durbin_watson_statistic()

Returns the d value of the Durbin-Watson statistic. A value well below 2 (d << 2) indicates positive autocorrelation, a value well above 2 (d >> 2) indicates negative autocorrelation, and a value around 2 indicates no autocorrelation.

# File lib/bullshit.rb, line 1259
    def durbin_watson_statistic
      e = linear_regression.residues
      e.size <= 1 and return 2.0
      (1...e.size).inject(0.0) { |s, i| s + (e[i] - e[i - 1]) ** 2 } /
        e.inject(0.0) { |s, x| s + x ** 2 }
    end
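
A standalone sketch makes the interpretation of d concrete. It takes an array of residuals directly (the library derives them from its linear regression); the function name durbin_watson is hypothetical, and all-zero residuals are not handled.

```ruby
# Standalone sketch; durbin_watson is an illustrative name.
def durbin_watson(e)
  return 2.0 if e.size <= 1
  # Sum of squared successive differences over the sum of squared residuals.
  num = (1...e.size).sum { |i| (e[i] - e[i - 1])**2 }
  den = e.sum { |x| x**2 }
  num / den
end

p durbin_watson([1.0, 1.0, -1.0, -1.0])  # neighbours mostly agree: d = 1.0
p durbin_watson([1.0, -1.0, 1.0, -1.0])  # neighbours alternate: d = 3.0
```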

geometric_mean()

Returns the geometric mean of the measurements. The measurements are scanned in order: the method returns 0.0 if it encounters a measurement equal to 0.0 and NaN if it encounters a measurement less than 0.0, whichever comes first.

# File lib/bullshit.rb, line 1103
    def geometric_mean
      @geometric_mean ||= (
        sum = @measurements.inject(0.0) { |s, t|
          case
          when t > 0
            s + Math.log(t)
          when t == 0
            break :null
          else
            break nil
          end
        }
        case sum
        when :null
          0.0
        when Float
          Math.exp(sum / size)
        else
          0 / 0.0
        end
      )
    end

harmonic_mean()

Returns the harmonic mean of the measurements. If any of the measurements is less than or equal to 0.0, this method returns NaN.

# File lib/bullshit.rb, line 1088
    def harmonic_mean
      @harmonic_mean ||= (
        sum = @measurements.inject(0.0) { |s, t|
          if t > 0
            s + 1.0 / t
          else
            break nil
          end
        }
        sum ? size / sum : 0 / 0.0
      )
    end
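
The geometric and harmonic means, including their edge cases, can be sketched on plain arrays as follows. These top-level functions are hypothetical stand-ins that follow the listings above without calling the library.

```ruby
# Standalone sketch; these functions are illustrative only.
def geometric_mean(xs)
  log_sum = 0.0
  xs.each do |x|
    return 0.0 if x == 0         # a zero factor makes the product zero
    return Float::NAN if x < 0   # undefined for negative values
    log_sum += Math.log(x)
  end
  Math.exp(log_sum / xs.size)
end

def harmonic_mean(xs)
  return Float::NAN unless xs.all? { |x| x > 0 }
  xs.size / xs.sum { |x| 1.0 / x }
end

xs = [1.0, 2.0, 4.0]
p xs.sum / xs.size    # arithmetic mean, 7/3
p geometric_mean(xs)  # cube root of 8, about 2.0
p harmonic_mean(xs)   # 3 / 1.75, about 1.714
```

For positive data the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean.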

histogram(bins)

Returns a Histogram instance with bins as the number of bins for this analysis’ measurements.

# File lib/bullshit.rb, line 1338
    def histogram(bins)
      Histogram.new(self, bins)
    end

linear_regression()

Returns the LinearRegression object for the equation a * x + b which represents the line computed by the linear regression algorithm.

# File lib/bullshit.rb, line 1332
    def linear_regression
      @linear_regression ||= LinearRegression.new @measurements
    end

ljung_box_statistic(lags = 20)

Returns the q value of the Ljung-Box statistic for the given number of lags. A higher value might indicate autocorrelation in the measurements of this Analysis instance. This method returns nil if there are too few measurements for the requested number of lags, that is, if lags is not smaller than the number of autocorrelation values (size - 1).

# File lib/bullshit.rb, line 1270
    def ljung_box_statistic(lags = 20)
      r = autocorrelation
      lags >= r.size and return
      n = size
      n * (n + 2) * (1..lags).inject(0.0) { |s, i| s + r[i] ** 2 / (n - i) }
    end
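
A standalone sketch of the statistic, taking the autocorrelation values and the sample size as arguments, might look like this. The function name ljung_box_q is hypothetical, and the autocorrelation values below are simply made-up inputs for a series of five measurements.

```ruby
# Standalone sketch; ljung_box_q is an illustrative name.
# r holds the autocorrelations for lags 0 .. r.size - 1, n is the sample size.
def ljung_box_q(r, n, lags)
  return nil if lags >= r.size  # not enough lags available
  n * (n + 2) * (1..lags).sum { |i| r[i]**2 / (n - i) }
end

r = [1.0, 0.4, -0.1, -0.4]
p ljung_box_q(r, 5, 2)  # 35 * (0.16 / 4 + 0.01 / 3), about 1.517
p ljung_box_q(r, 5, 4)  # nil, because only lags 1 to 3 exist
```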

max()

Returns the maximum of the measurements.

# File lib/bullshit.rb, line 1132
    def max
      @max ||= @measurements.max
    end

mean()

Alias for arithmetic_mean

median(p = 50)

Alias for percentile

min()

Returns the minimum of the measurements.

# File lib/bullshit.rb, line 1127
    def min
      @min ||= @measurements.min
    end

percentile(p = 50)

Returns the p-percentile of the measurements. There are many methods to compute a percentile; this method uses the weighted average at x_(n + 1)p, which allows p to be in 0...100 (excluding 100).

# File lib/bullshit.rb, line 1140
    def percentile(p = 50)
      (0...100).include?(p) or
        raise ArgumentError, "p = #{p}, but has to be in (0...100)"
      p /= 100.0
      @sorted ||= @measurements.sort
      r = p * (@sorted.size + 1)
      r_i = r.to_i
      r_f = r - r_i
      if r_i >= 1
        result = @sorted[r_i - 1]
        if r_i < @sorted.size
          result += r_f * (@sorted[r_i] - @sorted[r_i - 1])
        end
      else
        result = @sorted[0]
      end
      result
    end

Also aliased as: median
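
The interpolation rule is easiest to follow on a tiny data set. The sketch below reimplements the listing as a standalone function on a plain array; it is an illustration, not the library method itself.

```ruby
# Standalone sketch of the weighted average at x_(n + 1)p.
def percentile(xs, p = 50)
  (0...100).include?(p) or
    raise ArgumentError, "p = #{p}, but has to be in (0...100)"
  sorted = xs.sort
  r = p / 100.0 * (sorted.size + 1)  # 1-based fractional rank
  r_i = r.to_i                       # integer part of the rank
  r_f = r - r_i                      # fractional part used as the weight
  return sorted[0] if r_i < 1
  result = sorted[r_i - 1]
  result += r_f * (sorted[r_i] - sorted[r_i - 1]) if r_i < sorted.size
  result
end

xs = [4.0, 1.0, 3.0, 2.0]
p percentile(xs, 25)  # rank 1.25, between 1.0 and 2.0
p percentile(xs, 50)  # rank 2.5, halfway between 2.0 and 3.0
p percentile(xs, 99)  # rank 4.95, clamped to the largest value
```

Ranks below 1 fall back to the smallest value and ranks at or beyond n to the largest, so the result always stays within the data range.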

sample_standard_deviation()

Returns the sample standard deviation of the measurements.

# File lib/bullshit.rb, line 1064
    def sample_standard_deviation
      @sample_standard_deviation ||= Math.sqrt(sample_variance)
    end

sample_standard_deviation_percentage()

Returns the sample standard deviation of the measurements in percentage of the arithmetic mean.

# File lib/bullshit.rb, line 1070
    def sample_standard_deviation_percentage
      @sample_standard_deviation_percentage ||= 100.0 * sample_standard_deviation / arithmetic_mean
    end

sample_variance()

Returns the sample variance of the measurements.

# File lib/bullshit.rb, line 1042
    def sample_variance
      @sample_variance ||= size > 1 ? sum_of_squares / (size - 1.0) : 0.0
    end

size()

Returns the number of measurements on which the analysis is based.

# File lib/bullshit.rb, line 1032
    def size
      @measurements.size
    end

standard_deviation()

Returns the standard deviation of the measurements.

# File lib/bullshit.rb, line 1053
    def standard_deviation
      @standard_deviation ||= Math.sqrt(variance)
    end

standard_deviation_percentage()

Returns the standard deviation of the measurements in percentage of the arithmetic mean.

# File lib/bullshit.rb, line 1059
    def standard_deviation_percentage
      @standard_deviation_percentage ||= 100.0 * standard_deviation / arithmetic_mean
    end

suggested_sample_size(other, alpha = 0.05, beta = 0.05)

Computes a sample size that is more likely to reveal a difference between the mean of this instance’s measurements and that of other. alpha and beta are the levels for errors of the first and second kind (type I and type II errors).

# File lib/bullshit.rb, line 1211
    def suggested_sample_size(other, alpha = 0.05, beta = 0.05)
      alpha, beta = alpha.abs, beta.abs
      signal = arithmetic_mean - other.arithmetic_mean
      df = size + other.size - 2
      pooled_variance_estimate = (sum_of_squares + other.sum_of_squares) / df
      td = TDistribution.new df
      (((td.inverse_probability(alpha) + td.inverse_probability(beta)) *
        Math.sqrt(pooled_variance_estimate)) / signal) ** 2
    end

sum()

Returns the sum of all measurements.

# File lib/bullshit.rb, line 1075
    def sum
      @sum ||= @measurements.inject(0.0) { |s, t| s + t }
    end

sum_of_squares()

Returns the sum of squares (the sum of the squared deviations) of the measurements.

# File lib/bullshit.rb, line 1048
    def sum_of_squares
      @sum_of_squares ||= @measurements.inject(0.0) { |s, t| s + (t - arithmetic_mean) ** 2 }
    end

t_student(other)

Returns the t value of Student’s t-test between this Analysis instance and other.

# File lib/bullshit.rb, line 1200
    def t_student(other)
      signal = arithmetic_mean - other.arithmetic_mean
      noise = common_standard_deviation(other) *
        Math.sqrt(size ** -1 + other.size ** -1)
      signal / noise
    rescue Errno::EDOM
      0.0
    end

t_welch(other)

Returns the t value of Welch’s t-test between this Analysis instance and other.

# File lib/bullshit.rb, line 1171
    def t_welch(other)
      signal = arithmetic_mean - other.arithmetic_mean
      noise = Math.sqrt(sample_variance / size +
        other.sample_variance / other.size)
      signal / noise
    rescue Errno::EDOM
      0.0
    end
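
Both t values are a signal-to-noise ratio: the difference of the means divided by an estimate of its standard error. The standalone sketch below computes them for plain arrays; all function names are hypothetical and do not call the library. For groups of equal size the two noise terms coincide, so both tests give the same t value.

```ruby
# Standalone sketch; all helpers are illustrative only.
def mean(xs)
  xs.sum.to_f / xs.size
end

def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / (xs.size - 1.0)
end

# Welch: each group contributes its own variance to the noise term.
def t_welch(a, b)
  noise = Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
  (mean(a) - mean(b)) / noise
end

# Student: both groups share one pooled variance estimate.
def t_student(a, b)
  pooled = ((a.size - 1) * sample_variance(a) + (b.size - 1) * sample_variance(b)) /
           (a.size + b.size - 2)
  noise = Math.sqrt(pooled) * Math.sqrt(1.0 / a.size + 1.0 / b.size)
  (mean(a) - mean(b)) / noise
end

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0, 8.0]
p t_welch(a, b)    # -3 / sqrt(2), about -2.12
p t_student(a, b)  # -3 / sqrt(4.4 * 7 / 12), about -1.87
```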

variance()

Returns the variance of the measurements.

# File lib/bullshit.rb, line 1037
    def variance
      @variance ||= sum_of_squares / size
    end
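
The difference between variance and sample_variance is only the divisor: size for the former, size - 1 for the latter. A standalone sketch on a plain array, with hypothetical top-level helpers that follow the listings on this page:

```ruby
# Standalone sketch; these helpers are illustrative only.
def sum_of_squares(xs)
  mean = xs.sum.to_f / xs.size
  xs.sum { |x| (x - mean)**2 }
end

def variance(xs)
  sum_of_squares(xs) / xs.size
end

def sample_variance(xs)
  xs.size > 1 ? sum_of_squares(xs) / (xs.size - 1.0) : 0.0
end

xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5.0, sum of squares 32.0
p variance(xs)                           # 32 / 8 = 4.0
p sample_variance(xs)                    # 32 / 7, about 4.571
p 100.0 * Math.sqrt(variance(xs)) / 5.0  # standard deviation as a percentage of the mean: 40.0
```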
