Class: MoreMath::Histogram

Inherits:
Object
Defined in:
lib/more_math/histogram.rb

Overview

Represents a histogram for visualizing data distributions

The Histogram class provides functionality to create and display histograms from sequences of numerical data. It divides the data into bins and counts how many elements fall into each bin, then displays this information in a readable format with optional UTF-8 bar characters.

Examples:

Creating a histogram

sequence = [1, 2, 3, 4, 5, 1]
hist = Histogram.new(sequence, bins: 3)

Displaying a histogram

hist.display($stdout, 80)

Defined Under Namespace

Classes: Bin

Constructor Details

#initialize(sequence, arg = 10) ⇒ Histogram

Create a Histogram for the elements of sequence, split into the given number of bins (10 by default).

Parameters:

  • sequence (Enumerable)

    The sequence to build the histogram from

  • arg (Integer, Hash) (defaults to: 10)

    Number of bins, or a hash with the options `:bins` and `:with_counts`

Options Hash (arg):

  • :bins (Integer) — default: 10

    Number of bins to use

  • :with_counts (Boolean) — default: false

    Whether to display counts in output



# File 'lib/more_math/histogram.rb', line 35

def initialize(sequence, arg = 10)
  @with_counts = false
  if arg.is_a?(Hash)
    bins = arg.fetch(:bins, 10)
    wc = arg[:with_counts] and @with_counts = wc
  else
    bins = arg
  end
  @sequence = sequence
  @bins = bins
  @result = compute
end
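
The constructor accepts either a plain bin count or an options hash. The following standalone sketch mirrors that argument handling outside the class; `normalize_histogram_arg` is a hypothetical helper introduced only for illustration and is not part of MoreMath.

```ruby
# Hypothetical helper (not part of MoreMath): mirrors how the constructor
# interprets its second argument and returns [bins, with_counts].
def normalize_histogram_arg(arg = 10)
  with_counts = false
  if arg.is_a?(Hash)
    bins = arg.fetch(:bins, 10)   # :bins falls back to 10 when absent
    wc = arg[:with_counts]
    with_counts = wc if wc        # only a truthy value overrides the default
  else
    bins = arg                    # a bare Integer is taken as the bin count
  end
  [bins, with_counts]
end

p normalize_histogram_arg                               # [10, false]
p normalize_histogram_arg(5)                            # [5, false]
p normalize_histogram_arg(bins: 3, with_counts: true)   # [3, true]
```

Note that a hash without `:bins` still yields 10 bins, and a falsy `:with_counts` leaves the default of false in place.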

Instance Attribute Details

#bins ⇒ Integer (readonly)

Number of bins for this Histogram.

Returns:

  • (Integer)


# File 'lib/more_math/histogram.rb', line 51

def bins
  @bins
end

Instance Method Details

#ascii_bar(bar_width) ⇒ String (private)

Generate ASCII bar character representation based on width.

Parameters:

  • bar_width (Float)

    Width of the bar

Returns:

  • (String)


# File 'lib/more_math/histogram.rb', line 130

def ascii_bar(bar_width)
  ?* * bar_width
end

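#compute ⇒ Array<Bin> (private)

Computes the histogram and returns it as an array of Bin objects, each holding the left boundary, right boundary, and count of one bin. Returns an empty array if the sequence is empty.

Returns:

  • (Array<Bin>)
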

# File 'lib/more_math/histogram.rb', line 206

def compute
  @sequence.empty? and return []
  last_r = -Infinity
  min = @sequence.min
  max = @sequence.max
  step = (max - min) / bins.to_f
  Array.new(bins) do |i|
    l = min + i  * step
    r = min + (i + 1) * step
    c = 0
    @sequence.each do |x|
      x > last_r and (x <= r || i == bins - 1) and c += 1
    end
    last_r = r
    Bin.new(l, r, c)
  end
end
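
To make the binning rule concrete, here is a standalone sketch of the same equal-width scheme. `SketchBin` and `sketch_bins` are hypothetical names used only for this illustration, and `Float::INFINITY` is assumed to be equivalent to the `Infinity` constant used in the source above.

```ruby
# Hypothetical stand-in for the library's Bin class.
SketchBin = Struct.new(:left, :right, :count)

# Equal-width binning: each bin covers (previous right edge, right edge],
# and the last bin also takes everything above its computed right edge so
# that floating-point rounding cannot drop the maximum.
def sketch_bins(sequence, bins)
  return [] if sequence.empty?
  min, max = sequence.minmax
  step = (max - min) / bins.to_f
  last_r = -Float::INFINITY
  Array.new(bins) do |i|
    l = min + i * step
    r = min + (i + 1) * step
    last = i == bins - 1
    c = sequence.count { |x| x > last_r && (x <= r || last) }
    last_r = r
    SketchBin.new(l, r, c)
  end
end

result = sketch_bins([1, 2, 3, 4, 5, 1], 3)
p result.map(&:count)   # [3, 1, 2]
```

With the sequence from the overview example, the step is 4/3, so the bins end at roughly 2.33, 3.67, and 5.0; the two 1s and the 2 land in the first bin, the 3 in the second, and the 4 and 5 in the third.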

#counts ⇒ Array<Integer>

Get an array of counts from each bin.

Returns:

  • (Array<Integer>)


# File 'lib/more_math/histogram.rb', line 71

def counts
  each_bin.map(&:count)
end

#display(output = $stdout, width = 65) ⇒ self

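Display this histogram to output using width columns. Raises ArgumentError unless width is greater than 15; the error message says ">= 15", but the check as written also rejects a width of exactly 15.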

Parameters:

  • output (IO) (defaults to: $stdout)

    The output stream to write to

  • width (Integer, String) (defaults to: 65)

    Width of the display in columns; a percentage string like “90%” is resolved against the terminal width and clamped to the range 0 to 100

Returns:

  • (self)

Raises:

  • (ArgumentError)

    If width is not greater than 15 (the check in the source is width > 15)



# File 'lib/more_math/histogram.rb', line 82

def display(output = $stdout, width = 65)
  if width.is_a?(String) && width =~ /(.+)%\z/
    percentage = Float($1).clamp(0, 100)
    width = (terminal_width * (percentage / 100.0)).floor
  end
  width > 15 or raise ArgumentError, "width needs to be >= 15"
  for r in rows
    output << output_row(r, width)
  end
  output << "max_count=#{max_count}\n"
  self
end
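
The width handling can be tried in isolation. The sketch below copies the resolution logic into a hypothetical `resolve_width` helper that takes the terminal column count as a parameter (an assumption made here so the example does not depend on Tins::Terminal).

```ruby
# Hypothetical helper (not part of MoreMath): resolves an Integer or
# percentage-string width the same way #display does, with the terminal
# column count passed in explicitly.
def resolve_width(width, terminal_columns)
  if width.is_a?(String) && width =~ /(.+)%\z/
    percentage = Float($1).clamp(0, 100)
    width = (terminal_columns * (percentage / 100.0)).floor
  end
  width > 15 or raise ArgumentError, "width needs to be >= 15"
  width
end

p resolve_width(65, 80)       # 65
p resolve_width("50%", 80)    # 40
p resolve_width("150%", 80)   # 80 (percentage clamped to 100)
```

A small percentage of a narrow terminal can resolve to 15 or fewer columns, in which case the same ArgumentError is raised as for a too-small Integer width.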

#each_bin {|Bin| ... } ⇒ Array<Bin>

Iterate over each bin in the histogram.

Yields:

  • (Bin)

    each bin

Returns:

  • (Array<Bin>)


# File 'lib/more_math/histogram.rb', line 64

def each_bin(&block)
  @result.each(&block)
end

#max_count ⇒ Integer

Get the maximum count in any bin.

Returns:

  • (Integer)


# File 'lib/more_math/histogram.rb', line 105

def max_count
  counts.max
end

#output_row(row, width) ⇒ String (private)

Format a single row of histogram data for output.

Parameters:

  • row (Array)

    A tuple containing [left, right, count]

  • width (Integer)

    Width of the bar display area

Returns:

  • (String)


# File 'lib/more_math/histogram.rb', line 146

def output_row(row, width)
  left, right, count = row
  if @with_counts
    output_row_with_count(left, right, count, width)
  else
    output_row_without_count(left, right, count, width)
  end
end

#output_row_with_count(left, right, count, width) ⇒ String (private)

Output a row with counts.

Parameters:

  • left (Float)

    Left boundary of bin

  • right (Float)

    Right boundary of bin

  • count (Integer)

    Count in bin

  • width (Integer)

    Width of bar display area

Returns:

  • (String)


# File 'lib/more_math/histogram.rb', line 162

def output_row_with_count(left, right, count, width)
  width -= 15
  c = utf8? ? 2 : 1
  left_width = width - (counts.map { |x| x.to_s.size }.max + c)
  if left_width < 0
    left_width = width
  end
  factor    = left_width.to_f / max_count
  bar_width = (count * factor)
  bar = utf8? ? utf8_bar(bar_width) : ascii_bar(bar_width)
  max_count_length = max_count.to_s.size
  "%11.5f -|%#{-width + max_count_length}s%#{max_count_length}s\n" %
    [ (left + right) / 2.0, bar, count ]
end
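
The column arithmetic for rows with counts is easier to follow with concrete numbers. This is a standalone, ASCII-only sketch; `sketch_row_with_count` is a hypothetical function, and it takes the full list of bin counts as an extra parameter instead of reading it from the histogram.

```ruby
# Hypothetical, ASCII-only re-creation of the with-counts row layout.
# 15 columns are reserved for the "%11.5f -|" label, and the bar is
# scaled so that the largest bin leaves room for the count column.
def sketch_row_with_count(left, right, count, counts, width)
  width -= 15
  widest_count = counts.map { |x| x.to_s.size }.max
  left_width = width - (widest_count + 1)   # 1 spacer column for ASCII bars
  left_width = width if left_width < 0
  max_count = counts.max
  factor = left_width.to_f / max_count
  bar = "*" * (count * factor).floor
  max_count_length = max_count.to_s.size
  "%11.5f -|%#{-width + max_count_length}s%#{max_count_length}s\n" %
    [(left + right) / 2.0, bar, count]
end

print sketch_row_with_count(2, 4, 2, [4, 2], 30)
```

With a total width of 30, 15 columns remain after the label; the bar is scaled to at most 13 columns, padded on the right to 14, and the count is right-aligned in the final column. The bin midpoint, here 3.0, is used as the row label.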

#output_row_without_count(left, right, count, width) ⇒ String (private)

Output a row without counts.

Parameters:

  • left (Float)

    Left boundary of bin

  • right (Float)

    Right boundary of bin

  • count (Integer)

    Count in bin

  • width (Integer)

    Width of bar display area

Returns:

  • (String)


# File 'lib/more_math/histogram.rb', line 184

def output_row_without_count(left, right, count, width)
  width -= 15
  left_width = width
  left_width < 0 and left_width = width
  factor    = left_width.to_f / max_count
  bar_width = (count * factor)
  bar = utf8? ? utf8_bar(bar_width) : ascii_bar(bar_width)
  "%11.5f -|%#{-width}s\n" % [ (left + right) / 2.0, bar ]
end
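
As a rough illustration of how a bar is scaled against the largest bin, the sketch below reproduces the row layout without counts using ASCII bars only. `sketch_row_without_count` is a hypothetical function that takes max_count as an explicit parameter.

```ruby
# Hypothetical, ASCII-only re-creation of the row layout without counts.
# The bar for the fullest bin spans the whole bar area; other bins are
# scaled proportionally and left-justified.
def sketch_row_without_count(left, right, count, max_count, width)
  width -= 15                           # columns used by the numeric label
  factor = width.to_f / max_count
  bar = "*" * (count * factor).floor    # explicit floor of the Float width
  "%11.5f -|%#{-width}s\n" % [(left + right) / 2.0, bar]
end

print sketch_row_without_count(1, 2, 2, 4, 25)
```

Here the bar area is 10 columns wide, so a bin holding half of the maximum count is drawn with 5 asterisks followed by 5 spaces of padding.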

#rows ⇒ Array<Array> (private)

Returns the bins as [left, right, count] rows in reverse order, so that the highest bin is displayed first.

Returns:

  • (Array<Array>)


# File 'lib/more_math/histogram.rb', line 197

def rows
  @result.reverse_each.map { |bin|
    [ bin.left, bin.right, bin.count ]
  }
end

#terminal_width ⇒ Integer

Get terminal width using Tins::Terminal.

Returns:

  • (Integer)


# File 'lib/more_math/histogram.rb', line 98

def terminal_width
  Tins::Terminal.columns
end

#to_a ⇒ Array<Bin>

Return the computed histogram as an array of Bin objects.

Returns:

  • (Array<Bin>)


# File 'lib/more_math/histogram.rb', line 56

def to_a
  @result
end

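#utf8? ⇒ Boolean (private)

Determine if UTF-8 is enabled in the environment by checking whether ENV['LANG'] ends with "utf-8" (case-insensitive). The result is truthy or falsy (a match position or nil) rather than strictly true or false.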

Returns:

  • (Boolean)


# File 'lib/more_math/histogram.rb', line 137

def utf8?
  ENV['LANG'] =~ /utf-8\z/i
end
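
For callers who want a strict boolean, a variant of this check could look like the following sketch. `utf8_locale?` is a hypothetical helper, not part of MoreMath, and it takes the locale string as a parameter so it can be tried without changing the environment.

```ruby
# Hypothetical helper: same pattern as #utf8?, but returns true or false
# and tolerates an unset LANG (nil).
def utf8_locale?(lang = ENV['LANG'])
  lang.to_s.match?(/utf-8\z/i)
end

p utf8_locale?("en_US.UTF-8")   # true
p utf8_locale?("C")             # false
p utf8_locale?(nil)             # false
```

Because the pattern requires the hyphen, a value spelled "en_US.utf8" does not match, which is also true of the check in the source above.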

#utf8_bar(bar_width) ⇒ String (private)

Generate UTF-8 bar character representation based on width.

Parameters:

  • bar_width (Float)

    Width of the bar

Returns:

  • (String)


# File 'lib/more_math/histogram.rb', line 115

def utf8_bar(bar_width)
  fract = bar_width - bar_width.floor
  bar   = ?⣿ * bar_width.floor
  if fract > 0.5
    bar << ?⡇
  else
    bar << ' '
  end
  bar
end
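
The rounding rule for the UTF-8 bar can be seen in a small standalone sketch. `sketch_utf8_bar` is a hypothetical function that follows the same steps: one full braille cell per whole unit of width, then either a half cell (fraction above 0.5) or a space.

```ruby
# Hypothetical re-creation of the UTF-8 bar: full cells for the integer
# part, then a half cell or a space depending on the fractional part.
def sketch_utf8_bar(bar_width)
  whole = bar_width.floor
  fract = bar_width - whole
  "⣿" * whole + (fract > 0.5 ? "⡇" : " ")
end

puts sketch_utf8_bar(2.75)   # two full cells and a half cell
puts sketch_utf8_bar(2.25)   # two full cells and a trailing space
```

The returned string is therefore always one character longer than the integer part of the width, which appears to be why the with-counts layout reserves two extra columns for UTF-8 bars instead of one.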